790 Hits in 4.0 sec

Unsupervised and Few-shot Parsing from Pretrained Language Models [article]

Zhiyuan Zeng, Deyi Xiong
2022 arXiv   pre-print
Experiments on cross-lingual parsing show that both unsupervised and few-shot parsing methods are better than previous methods on most languages of SPMRL [Seddah et al., 2013].  ...  We therefore extend the unsupervised models to few-shot parsing models (FPOA, FPIO) that use a few annotated trees to learn better linear projection matrices for parsing.  ...  Acknowledgements The present research was supported by the National Key Research and Development Program of China (Grant No. 2019QY1802).  ... 
arXiv:2206.04980v1 fatcat:a32qyrwkfjamnhe4g3t24ytzsq
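
A minimal, hypothetical sketch of the "linear projection for parsing" idea mentioned in the entry above: fit a projection of contextual word vectors from a handful of annotated trees so that adjacent-word differences act as split scores, then induce a binary tree greedily. The random vectors, the ridge-regression fit, and the helper names are illustrative assumptions, not the paper's FPOA/FPIO algorithms.

    # Hypothetical sketch: learn a linear projection of contextual word vectors
    # so that adjacent-word differences score how strongly two words should be
    # split apart, then induce a binary tree by recursive splitting.
    # Not the paper's FPOA/FPIO procedure, only an illustration of the idea.
    import numpy as np

    def fit_projection(X, y, lam=1e-2):
        """Ridge regression: X (n, d) adjacent-word vector differences,
        y (n,) gold syntactic distances; returns a scoring vector w (d,)."""
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    def split_scores(embeds, w):
        diffs = embeds[1:] - embeds[:-1]    # differences of adjacent words
        return diffs @ w                    # one score per adjacent pair

    def build_tree(words, scores):
        """Greedy top-down split at the largest predicted score."""
        if len(words) == 1:
            return words[0]
        k = int(np.argmax(scores)) + 1
        return (build_tree(words[:k], scores[:k - 1]),
                build_tree(words[k:], scores[k:]))

    # toy usage: random vectors stand in for pretrained LM hidden states
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(5, 16))
    w = fit_projection(rng.normal(size=(20, 16)), rng.normal(size=20))
    print(build_tree(["the", "cat", "sat", "on", "mats"], split_scores(emb, w)))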

Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi [article]

Benjamin Muller and Benoit Sagot and Djamé Seddah
2020 arXiv   pre-print
Focusing on two tasks, part-of-speech tagging and dependency parsing, we show in zero-shot and unsupervised adaptation scenarios that multilingual language models are able to transfer to such an unseen dialect, specifically in two extreme cases: (i) across scripts, using Modern Standard Arabic as a source language, and (ii) from a distantly related language, unseen during pretraining, namely Maltese.  ...  This project also received support from the French Ministry of Industry and Ministry of Foreign Affairs via the PHC Maimonide France-Israel cooperation programme.  ... 
arXiv:2005.00318v1 fatcat:iii52cuwcbg3nbrtmfvk2slr5m

When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models [article]

Benjamin Muller and Antonis Anastasopoulos and Benoît Sagot and Djamé Seddah
2021 arXiv   pre-print
Some languages greatly benefit from transfer learning and behave similarly to closely related high resource languages whereas others apparently do not.  ...  Transfer learning based on pretraining language models on a large amount of raw data has become a new norm to reach state-of-the-art performance in NLP.  ...  Antonios Anastasopoulos is generously supported by NSF Award 2040926 and is also thankful to Graham Neubig for very insightful initial discussions on this research direction.  ... 
arXiv:2010.12858v2 fatcat:w467spjj2vezdjohoiw4of4ydu

Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation [article]

Xinyi Wang, Sebastian Ruder, Graham Neubig
2022 arXiv   pre-print
The performance of multilingual pretrained models is highly dependent on the availability of monolingual or parallel text present in a target language.  ...  Thus, the majority of the world's languages cannot benefit from recent progress in NLP as they have no or limited textual data.  ...  The authors would like to thank Aditi Chaudhary, Arya McCarthy, Shruti Rijhwani for discussions about the project, and Daan van Esch for the general feedback and pointing out additional linguistic resources  ... 
arXiv:2203.09435v2 fatcat:oki5ksv6cfal5efp5qcxnqqp4m
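
A toy sketch of the lexicon-based adaptation idea from the entry above: reuse labels from a source-language sentence by translating it word-by-word with a bilingual lexicon to synthesize target-language training data. The lexicon entries and sentence below are made-up examples, not the paper's actual resources or pipeline.

    # Toy sketch of lexicon-based data synthesis: translate a POS-tagged
    # source sentence word-by-word with a bilingual lexicon so the tags can
    # be reused for a target language that has no labeled text.
    # The lexicon and sentence are hypothetical examples.
    def synthesize(tagged_sentence, lexicon):
        out = []
        for word, tag in tagged_sentence:
            # fall back to the source word when the lexicon has no entry
            out.append((lexicon.get(word.lower(), word), tag))
        return out

    lexicon = {"the": "el", "dog": "perro", "barks": "ladra"}  # toy entries
    sentence = [("The", "DET"), ("dog", "NOUN"), ("barks", "VERB")]
    print(synthesize(sentence, lexicon))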

Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing [article]

Tal Schuster, Ori Ram, Regina Barzilay, Amir Globerson
2019 arXiv   pre-print
Our experimental results demonstrate the effectiveness of this approach for zero-shot and few-shot learning of dependency parsing.  ...  We introduce a novel method for multilingual transfer that utilizes deep contextual embeddings, pretrained in an unsupervised fashion.  ...  Acknowledgements We thank the MIT NLP group and the reviewers for their helpful discussion and comments.  ...  We filtered words with multiple translations to the most common one by Google Translate.  ... 
arXiv:1902.09492v2 fatcat:hlhz4gbikfaejpucvcfhptoblq

Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing

Tal Schuster, Ori Ram, Regina Barzilay, Amir Globerson
2019   Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)
Our experimental results demonstrate the effectiveness of this approach for zero-shot and few-shot learning of dependency parsing.  ...  We introduce a novel method for multilingual transfer that utilizes deep contextual embeddings, pretrained in an unsupervised fashion.  ...  Acknowledgements We thank the MIT NLP group and the reviewers for their helpful discussion and comments.  ...  We filtered words with multiple translations to the most common one by Google Translate.  ... 
doi:10.18653/v1/n19-1162 dblp:conf/naacl/SchusterRBG19 fatcat:ns2bxzatkjdovnyqzxegtw53i4
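
A rough sketch of the anchor-and-rotate idea behind the two entries above: average contextual vectors per word type into anchors, then learn an orthogonal map (Procrustes) from the target space to the source space. The synthetic anchors below stand in for real pretrained contextual states; this illustrates the general alignment recipe, not the paper's exact training details.

    # Sketch of anchor-based alignment via orthogonal Procrustes: find the
    # rotation W minimizing ||tgt_anchors @ W - src_anchors||_F, then map
    # target-language contextual vectors into the source space with W.
    # Synthetic anchors below; illustration only.
    import numpy as np

    def procrustes(src_anchors, tgt_anchors):
        U, _, Vt = np.linalg.svd(tgt_anchors.T @ src_anchors)
        return U @ Vt

    rng = np.random.default_rng(0)
    d = 8
    true_rot, _ = np.linalg.qr(rng.normal(size=(d, d)))   # hidden rotation
    src = rng.normal(size=(100, d))                        # source anchors
    tgt = src @ true_rot.T                                 # rotated copies
    W = procrustes(src, tgt)
    print(np.allclose(tgt @ W, src, atol=1e-6))            # True: recovered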

Training Naturalized Semantic Parsers with Very Little Data [article]

Subendhu Rongali, Konstantine Arkoudas, Melanie Rubino, Wael Hamza
2022 arXiv   pre-print
We show that this method delivers new SOTA few-shot performance on the Overnight dataset, particularly in very low-resource settings, and very compelling few-shot results on a new semantic parsing dataset  ...  This approach delivers strong results, particularly for few-shot semantic parsing, which is of key importance in practice and the focus of our paper.  ...  We analyze our model design and results in another series of experiments and show the effectiveness of our approach in constructing a robust, well-performing semantic parsing model.  ... 
arXiv:2204.14243v2 fatcat:fl4xc7l7ybfr5ltsd7yi5dgime

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing [article]

Haoyue Shi, Kevin Gimpel, Karen Livescu
2021 arXiv   pre-print
In addition, SubDP improves zero-shot cross-lingual dependency parsing with very few (e.g., 50) supervised bitext pairs, across a broader range of target languages.  ...  and train a target language parser to fit the resulting distributions.  ... 
arXiv:2110.08538v1 fatcat:oyjmwh7ymfbmnmapprmpylyp2q
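
A hedged sketch of the substructure-projection step described in the entry above: given a source sentence's arc distribution and a soft source-to-target word alignment, project an arc distribution for the target sentence and renormalize; a target parser would then be trained to fit these distributions. The random matrices are placeholders, and SubDP's full procedure involves more than this single step.

    # Sketch of projecting a dependency-arc distribution through a soft word
    # alignment. p_src[i, j] = p(head of source word i is source word j);
    # align[i, k] = soft alignment of source word i to target word k.
    # Illustration only; random matrices stand in for real model outputs.
    import numpy as np

    def project_arcs(p_src, align):
        p_tgt = align.T @ p_src @ align                  # (n_tgt, n_tgt)
        return p_tgt / p_tgt.sum(axis=1, keepdims=True)  # renormalize rows

    rng = np.random.default_rng(0)
    n_src, n_tgt = 4, 5
    p_src = rng.random((n_src, n_src))
    p_src /= p_src.sum(axis=1, keepdims=True)
    align = rng.random((n_src, n_tgt))
    align /= align.sum(axis=1, keepdims=True)
    print(project_arcs(p_src, align).round(3))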

On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies [article]

Tianyi Zhang, Tatsunori Hashimoto
2021 arXiv   pre-print
Recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks.  ...  We construct cloze-like masks using task-specific lexicons for three different classification datasets and show that the majority of pretrained performance gains come from generic masks that are not associated  ...  When tested by cloze reductions, pretrained MLMs and left-to-right language models (LMs) have been shown to possess abundant factual knowledge (Petroni et al., 2019) and display impressive few-shot ability  ... 
arXiv:2104.05694v1 fatcat:ib424wuccjbwrk22gztudr7boi
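
A minimal sketch contrasting the two masking schemes discussed in the entry above: cloze-like masks driven by a hypothetical task-specific lexicon versus generic random masking. The sentiment lexicon and sentence are toy assumptions, not the paper's datasets.

    # Toy comparison of lexicon-driven "cloze-like" masking vs. generic random
    # masking. The sentiment lexicon and sentence are made-up examples.
    import random

    def lexicon_mask(tokens, lexicon, mask_token="[MASK]"):
        return [mask_token if t.lower() in lexicon else t for t in tokens]

    def random_mask(tokens, rate=0.15, mask_token="[MASK]", seed=0):
        rng = random.Random(seed)
        return [mask_token if rng.random() < rate else t for t in tokens]

    sentiment_lexicon = {"great", "terrible", "boring"}   # toy task lexicon
    tokens = "the movie was great but the ending felt boring".split()
    print(lexicon_mask(tokens, sentiment_lexicon))
    print(random_mask(tokens))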

How Do Multilingual Encoders Learn Cross-lingual Representation? [article]

Shijie Wu
2022 arXiv   pre-print
From an engineering perspective, multilingual NLP benefits development and maintenance by serving multiple languages with a single system.  ...  In exploring these questions, this thesis will analyze the behavior of multilingual models in a variety of settings on high and low resource languages.  ...  Zhao et al. (2021) find that few-shot transfer generally improves over zero-shot transfer, and that it is important to first train on the source language and then fine-tune with the few-shot target-language examples.  ... 
arXiv:2207.05737v1 fatcat:j6vfurgdhvhm5evwaqjhf4b3lu

Emerging Cross-lingual Structure in Pretrained Language Models [article]

Shijie Wu and Alexis Conneau and Haoran Li and Luke Zettlemoyer and Veselin Stoyanov
2020 arXiv   pre-print
We study the problem of multilingual masked language modeling, i.e. the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence  ...  For multilingual masked language modeling, these symmetries seem to be automatically discovered and aligned during the joint training process.  ...  XLM shows that cross-lingual language model pretraining leads to a new state of the art on XNLI and on supervised and unsupervised machine translation.  ... 
arXiv:1911.01464v3 fatcat:zcutmu7pq5hyxmpirkf243v6jy

Self-training Improves Pre-training for Natural Language Understanding [article]

Jingfei Du, Edouard Grave, Beliz Gunel, Vishrav Chaudhary, Onur Celebi, Michael Auli, Ves Stoyanov, Alexis Conneau
2020 arXiv   pre-print
Finally, we also show strong gains on knowledge distillation and few-shot learning.  ...  Unsupervised pre-training has led to much recent progress in natural language understanding.  ...  Few-shot learning experiments: We investigate the effectiveness of our approach in the context of few-shot learning.  ... 
arXiv:2010.02194v1 fatcat:i4btr6525zb7pe3dd2bfjum3ra
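
A generic self-training loop in the spirit of the entry above (not the paper's exact pipeline): train a teacher on labeled data, pseudo-label an unlabeled pool, keep confident predictions, and retrain a student on the union. Synthetic features and scikit-learn's LogisticRegression stand in for the real sentence encoder and task model.

    # Generic self-training loop (illustration only): teacher on labeled data,
    # pseudo-label an unlabeled pool, filter by confidence, retrain a student.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_lab = rng.normal(size=(40, 5))
    y_lab = (X_lab[:, 0] > 0).astype(int)                  # toy labels
    X_unl = rng.normal(size=(200, 5))                      # unlabeled pool

    teacher = LogisticRegression().fit(X_lab, y_lab)
    probs = teacher.predict_proba(X_unl)
    keep = probs.max(axis=1) > 0.9                         # confidence filter
    X_aug = np.vstack([X_lab, X_unl[keep]])
    y_aug = np.concatenate([y_lab, probs[keep].argmax(axis=1)])
    student = LogisticRegression().fit(X_aug, y_aug)
    print(f"kept {keep.sum()} pseudo-labeled examples")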

Zero-shot Cross-lingual Transfer is Under-specified Optimization [article]

Shijie Wu, Benjamin Van Durme, Mark Dredze
2022 arXiv   pre-print
...  generalization error reduces smoothly and linearly as we move from the monolingual to the bilingual model, suggesting that the model struggles to identify good solutions for both source and target languages  ...  Pretrained multilingual encoders enable zero-shot cross-lingual transfer, but often produce unreliable models that exhibit high performance variance on the target language.  ...  The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the  ... 
arXiv:2207.05666v1 fatcat:6vywifupv5dlzhxtqs5i5u3vde
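
A toy illustration of the interpolation analysis suggested by the snippet above: evaluate source- and target-language losses along the straight line between a "monolingual" and a "bilingual" solution in parameter space. The quadratic losses and 2-D parameters are assumptions for illustration only.

    # Toy illustration: losses along the straight line between a "monolingual"
    # solution and a "bilingual" solution in parameter space.
    import numpy as np

    theta_mono = np.array([1.0, 0.0])        # pretend source-only optimum
    theta_bi = np.array([0.6, 0.8])          # pretend joint optimum
    src_opt, tgt_opt = theta_mono, np.array([0.0, 1.0])

    def src_loss(t): return float(np.sum((t - src_opt) ** 2))
    def tgt_loss(t): return float(np.sum((t - tgt_opt) ** 2))

    for alpha in np.linspace(0.0, 1.0, 5):
        theta = (1 - alpha) * theta_mono + alpha * theta_bi
        print(f"alpha={alpha:.2f}  src_loss={src_loss(theta):.3f}  "
              f"tgt_loss={tgt_loss(theta):.3f}")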

Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised? [article]

Boshko Koloski and Senja Pollak and Blaž Škrlj and Matej Martinc
2022 arXiv   pre-print
We find that the pretrained models fine-tuned on a multilingual corpus covering languages that do not appear in the test set (i.e. in a zero-shot setting), consistently outscore unsupervised models in  ...  More specifically, we explore whether pretrained multilingual language models can be employed for zero-shot cross-lingual keyword extraction on low-resource languages with limited or no available labeled  ...  For further work we propose exploring few-shot scenarios, in which a small amount of target language data will be added to the multilingual train set.  ... 
arXiv:2202.06650v1 fatcat:gq5cihf2xbavhozjv5xbbvlhf4

Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation [article]

Goran Glavaš, Ivan Vulić
2021 arXiv   pre-print
Results from both monolingual English and zero-shot language transfer experiments (with intermediate target-language parsing) show that explicit formalized syntax, injected into transformers through IPT  ...  The recent advent of end-to-end neural models, self-supervised via language modeling (LM), and their success on a wide range of LU tasks, however, questions this belief.  ...  The work of Ivan Vulić is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no. 648909).  ... 
arXiv:2008.06788v2 fatcat:gea36ofyyffcnex5rtjiwucfaa
Showing results 1 — 15 out of 790 results