What does it mean to be language-agnostic? Probing multilingual sentence encoders for typological properties [article]

Rochelle Choenni, Ekaterina Shutova
2020-09-27, arXiv, pre-print
Multilingual sentence encoders have seen much success in cross-lingual model transfer for downstream NLP tasks. Yet, we know relatively little about the properties of individual languages or the general patterns of linguistic variation that they encode. We propose methods for probing sentence representations from state-of-the-art multilingual encoders (LASER, M-BERT, XLM and XLM-R) with respect to a range of typological properties pertaining to lexical, morphological and syntactic structure. In addition, we investigate how this information is distributed across all layers of the models. Our results show interesting differences in encoding linguistic variation associated with different pretraining strategies.
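The abstract does not describe the probing setup in detail, so the following is only a minimal sketch of the general idea behind probing: a small classifier is trained on frozen sentence vectors to predict a property, and its held-out accuracy is read as a rough indicator of how much of that property the vectors encode. Everything here is an assumption made for illustration: the logistic-regression probe, the synthetic 8-dimensional vectors, and the names `train_probe`, `probe_accuracy` and `toy_layer` are hypothetical and are not taken from the paper, which uses real encoder outputs and typological labels.

```python
import math
import random


def train_probe(X, y, epochs=200, lr=0.5):
    """Fit a logistic-regression probe on frozen vectors X with binary labels y."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, t in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - t
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Full-batch gradient descent step on the mean log-loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples whose label the probe predicts correctly."""
    correct = 0
    for x, t in zip(X, y):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((z > 0) == bool(t))
    return correct / len(X)


def toy_layer(n, signal, rng):
    """Hypothetical stand-in for one encoder layer (NOT real model output).

    Dimension 0 carries the binary property with strength `signal`;
    all other dimensions are Gaussian noise.
    """
    X, y = [], []
    for i in range(n):
        label = i % 2
        vec = [rng.gauss(0.0, 1.0) for _ in range(8)]
        vec[0] += signal if label else -signal
        X.append(vec)
        y.append(label)
    return X, y


rng = random.Random(0)
results = {}
for name, signal in [("layer_with_signal", 3.0), ("layer_without_signal", 0.0)]:
    X_train, y_train = toy_layer(200, signal, rng)
    X_test, y_test = toy_layer(100, signal, rng)
    w, b = train_probe(X_train, y_train)
    results[name] = probe_accuracy(w, b, X_test, y_test)

for name, acc in results.items():
    print(f"{name}: held-out probe accuracy = {acc:.2f}")
```

In this toy setting the probe should score well above chance on the layer that carries the signal and near chance on the one that does not. Repeating the same procedure on the representations from each layer of a real encoder is one way to compare how information is distributed across layers, though the choice of probe and evaluation split strongly affects how such numbers should be interpreted.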
arXiv:2009.12862v1 (https://arxiv.org/abs/2009.12862v1), fatcat:ev3eo6zu2zhjfd6gwst4weizkm (https://fatcat.wiki/release/ev3eo6zu2zhjfd6gwst4weizkm)