12,538 Hits in 6.4 sec

Meta Distant Transfer Learning for Pre-trained Language Models

Chengyu Wang, Haojie Pan, Minghui Qiu, Jun Huang, Fei Yang, Yin Zhang
2021 Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
With the wide availability of Pre-trained Language Models (PLMs), multi-task fine-tuning across domains has been extensively applied.  ...  Inspired by meta-learning, we propose the Meta Distant Transfer Learning (Meta-DTL) framework to learn the cross-task knowledge for PLM-based methods.  ...  We thank the anonymous reviewers for their helpful comments.  ... 
doi:10.18653/v1/2021.emnlp-main.768 fatcat:rj5227jorbge5olflk2stzqqhy

Soft Layer Selection with Meta-Learning for Zero-Shot Cross-Lingual Transfer [article]

Weijia Xu, Batool Haider, Jason Krone, Saab Mansour
2021 arXiv   pre-print
In this paper, we propose a novel meta-optimizer to soft-select which layers of the pre-trained model to freeze during fine-tuning.  ...  We train the meta-optimizer by simulating the zero-shot transfer scenario.  ...  Our meta-optimizer learns the update rate for each layer by simulating the zero-shot transfer scenario where the model fine-tuned on the source languages is tested on an unseen language.  ... 
arXiv:2107.09840v1 fatcat:k2zkvd56q5ghnj6i3cjen6udxe
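The layer-selection idea in the record above can be illustrated with a short, generic sketch: each layer of a fine-tuned model gets a learnable gate in (0, 1) that scales its gradient step, so a gate near zero effectively freezes the layer. This is a minimal PyTorch illustration with hypothetical names (GatedFineTuner, gate_logits), not the authors' implementation; in the paper the gates themselves would be meta-learned by simulating zero-shot transfer, while here they simply stay at their initial value.

```python
# Minimal sketch of soft layer selection via learned per-layer update rates.
# Hypothetical names; not the paper's released code.
import torch
import torch.nn as nn

class GatedFineTuner:
    """Scales each layer's gradient step by a learnable gate in (0, 1)."""

    def __init__(self, layers, lr=1e-3):
        self.layers = layers                              # e.g. transformer blocks
        self.lr = lr
        # one unconstrained logit per layer; sigmoid(0) = 0.5, so gates start half-open
        self.gate_logits = torch.zeros(len(layers), requires_grad=True)

    def gates(self):
        return torch.sigmoid(self.gate_logits)

    def inner_step(self, loss):
        """One fine-tuning step where layer i moves by gates[i] * lr * grad."""
        g = self.gates().detach()                         # gates fixed during the inner step
        per_layer = [list(layer.parameters()) for layer in self.layers]
        flat = [p for ps in per_layer for p in ps]
        grads = iter(torch.autograd.grad(loss, flat, allow_unused=True))
        with torch.no_grad():
            for i, ps in enumerate(per_layer):
                for p in ps:
                    grad = next(grads)
                    if grad is not None:
                        p -= g[i] * self.lr * grad

# toy usage: two linear "layers" standing in for encoder blocks
layers = [nn.Linear(8, 8), nn.Linear(8, 2)]
tuner = GatedFineTuner(layers)
x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(layers[1](torch.relu(layers[0](x))), y)
tuner.inner_step(loss)
```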

X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering [article]

Meryem M'hamdi, Doo Soon Kim, Franck Dernoncourt, Trung Bui, Xiang Ren, Jonathan May
2021 arXiv   pre-print
Recently, meta-learning has garnered attention as a promising technique for enhancing transfer learning under low-resource scenarios: particularly for cross-lingual transfer in Natural Language Understanding  ...  In this work, we propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for NLU.  ...  The meta-train stage transfers from the source to the target languages, while the meta-adaptation further adapts the model to the target language.  ... 
arXiv:2104.09696v2 fatcat:yt6um3pmbrf5zh7bdclsu5duoi
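The two stages described in the snippet above (meta-train: transfer from the source to the target languages; meta-adaptation: further adaptation within the target language) are both MAML-style episodes that differ mainly in how support and query sets are drawn. Below is a generic first-order MAML step in PyTorch with toy data and hypothetical names, meant only as a sketch of that episode structure, not the authors' X-METRA-ADA code.

```python
# Generic first-order MAML step; hypothetical episode format, not the authors' code.
import copy
import torch
import torch.nn as nn

def fomaml_step(model, episodes, inner_lr=1e-3, outer_lr=1e-4, inner_steps=3):
    """episodes: list of (support_batch, query_batch); each batch is (x, y)."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for support, query in episodes:
        learner = copy.deepcopy(model)                  # task-specific copy
        opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                    # inner loop on the support set
            x, y = support
            opt.zero_grad()
            nn.functional.cross_entropy(learner(x), y).backward()
            opt.step()
        x, y = query                                    # outer loss on the query set
        loss = nn.functional.cross_entropy(learner(x), y)
        grads = torch.autograd.grad(loss, list(learner.parameters()))
        for acc, g in zip(meta_grads, grads):
            acc += g
    with torch.no_grad():                               # first-order meta-update
        for p, g in zip(model.parameters(), meta_grads):
            p -= outer_lr * g / len(episodes)

# Meta-train style episode: support drawn from the source language, query from the target.
# Meta-adaptation style episode: both support and query drawn from the target language.
model = nn.Linear(16, 3)
src = (torch.randn(8, 16), torch.randint(0, 3, (8,)))
tgt = (torch.randn(8, 16), torch.randint(0, 3, (8,)))
fomaml_step(model, episodes=[(src, tgt)])               # meta-train
fomaml_step(model, episodes=[(tgt, tgt)])               # meta-adaptation
```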

Meta-Learning for Fast Cross-Lingual Adaptation in Dependency Parsing [article]

Anna Langedijk, Verna Dankers, Phillip Lippe, Sander Bos, Bryan Cardenas Guevara, Helen Yannakoudakis, Ekaterina Shutova
2022 arXiv   pre-print
We find that meta-learning with pre-training can significantly improve upon the performance of language transfer and standard supervised learning baselines for a variety of unseen, typologically diverse  ...  We train our model on a diverse set of languages to learn a parameter initialization that can adapt quickly to new languages.  ...  Zero-shot transfer, how-ever, is most successful among typologically similar, high-resource languages, and less so for languages distant from the training languages and in resource-lean scenarios (Lauscher  ... 
arXiv:2104.04736v3 fatcat:yptu2e7lkzhu3evfn4yzgvc67y

Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition

Genta Indra Winata, Zhaojiang Lin, Jamin Shin, Zihan Liu, Pascale Fung
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
We also show that, in cross-lingual settings, our model not only leverages closely related languages, but also learns from languages with different roots.  ...  Therefore, we propose Hierarchical Meta-Embeddings (HME) that learn to combine multiple monolingual word-level and subword-level embeddings to create language-agnostic lexical representations.  ...  We sincerely thank the three anonymous reviewers for their insightful comments on our paper.  ... 
doi:10.18653/v1/d19-1360 dblp:conf/emnlp/WinataLSLF19 fatcat:h545sr4gmrcp5kv2nbe3uab2ji
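The combination step described above, mixing several monolingual word-level and subword-level embeddings into one language-agnostic representation, is commonly implemented as an attention-weighted sum over projected source embeddings. The sketch below shows that generic pattern in PyTorch; the dimensions, names, and flat (non-hierarchical) structure are illustrative simplifications rather than the HME architecture itself.

```python
# Attention-weighted combination of several pre-trained embeddings (illustrative).
import torch
import torch.nn as nn

class MetaEmbedding(nn.Module):
    """Projects each source embedding to a shared space and mixes them with attention."""

    def __init__(self, source_dims, shared_dim):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, shared_dim) for d in source_dims)
        self.score = nn.Linear(shared_dim, 1)           # scalar attention score per source

    def forward(self, embeddings):
        # embeddings: list of tensors, each (batch, seq_len, source_dim_i)
        projected = torch.stack([p(e) for p, e in zip(self.proj, embeddings)], dim=2)
        # projected: (batch, seq_len, n_sources, shared_dim)
        weights = torch.softmax(self.score(projected).squeeze(-1), dim=-1)
        # weights: (batch, seq_len, n_sources); weighted sum over the sources
        return torch.einsum("bts,btsd->btd", weights, projected)

# toy usage: two 300-d monolingual word embeddings plus a 100-d subword embedding
hme = MetaEmbedding(source_dims=[300, 300, 100], shared_dim=256)
batch, seq = 2, 5
out = hme([torch.randn(batch, seq, 300), torch.randn(batch, seq, 300),
           torch.randn(batch, seq, 100)])
print(out.shape)  # torch.Size([2, 5, 256])
```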

Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition [article]

Genta Indra Winata, Zhaojiang Lin, Jamin Shin, Zihan Liu, Pascale Fung
2019 arXiv   pre-print
We also show that, in cross-lingual settings, our model not only leverages closely related languages, but also learns from languages with different roots.  ...  Therefore, we propose Hierarchical Meta-Embeddings (HME) that learn to combine multiple monolingual word-level and subword-level embeddings to create language-agnostic lexical representations.  ...  We sincerely thank the three anonymous reviewers for their insightful comments on our paper.  ... 
arXiv:1909.08504v1 fatcat:r6uysqfoubfsngizz6v4t3apea

Enhanced Meta-Learning for Cross-Lingual Named Entity Recognition with Minimal Resources

Qianhui Wu, Zijia Lin, Guoxin Wang, Hui Chen, Börje F. Karlsson, Biqing Huang, Chin-Yew Lin
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)  
To this end, we present a meta-learning algorithm to find a good model parameter initialization that could fast adapt to the given test case and propose to construct multiple pseudo-NER tasks for meta-training  ...  While all existing methods directly transfer from source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which  ...  except that it is pre-trained on concatenated Wikipedia data of 104 languages.  ... 
doi:10.1609/aaai.v34i05.6466 fatcat:aq4zussw7ncyvc7dzo2olnv4xu
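The test-time procedure sketched in this abstract, retrieving a few examples similar to the given test case, briefly fine-tuning the meta-learned initialization on them, and then predicting, can be illustrated generically as below. Retrieval by cosine similarity over sentence embeddings and the toy linear "tagger" are placeholder assumptions, not the paper's retrieval method or model.

```python
# Test-time adaptation on retrieved neighbours (illustrative; not the paper's code).
import copy
import torch
import torch.nn as nn

def adapt_and_predict(model, test_vec, test_x, bank_vecs, bank_batches,
                      k=3, steps=2, lr=1e-3):
    """bank_vecs: (N, d) sentence embeddings for a labelled source bank.
    bank_batches: list of N (x, y) training examples aligned with bank_vecs."""
    # 1. retrieve the k most similar labelled examples by cosine similarity
    sims = nn.functional.cosine_similarity(bank_vecs, test_vec.unsqueeze(0), dim=-1)
    topk = sims.topk(k).indices.tolist()
    # 2. fine-tune a disposable copy of the meta-learned model on those neighbours
    learner = copy.deepcopy(model)
    opt = torch.optim.SGD(learner.parameters(), lr=lr)
    for _ in range(steps):
        for i in topk:
            x, y = bank_batches[i]
            opt.zero_grad()
            nn.functional.cross_entropy(learner(x), y).backward()
            opt.step()
    # 3. predict the test case with the adapted copy
    with torch.no_grad():
        return learner(test_x).argmax(dim=-1)

# toy usage with random "sentence embeddings" and a linear stand-in for the tagger
model = nn.Linear(32, 5)
bank_vecs = torch.randn(100, 64)
bank = [(torch.randn(4, 32), torch.randint(0, 5, (4,))) for _ in range(100)]
pred = adapt_and_predict(model, torch.randn(64), torch.randn(4, 32), bank_vecs, bank)
```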

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources [article]

Qianhui Wu, Zijia Lin, Guoxin Wang, Hui Chen, Börje F. Karlsson, Biqing Huang, Chin-Yew Lin
2020 arXiv   pre-print
To this end, we present a meta-learning algorithm to find a good model parameter initialization that could fast adapt to the given test case and propose to construct multiple pseudo-NER tasks for meta-training  ...  While all existing methods directly transfer from source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which  ...  except that it is pre-trained on concatenated Wikipedia data of 104 languages.  ... 
arXiv:1911.06161v2 fatcat:zzwpoo5rz5g4tbmldwvuxiovtu

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning [article]

Mengzhou Xia, Guoqing Zheng, Subhabrata Mukherjee, Milad Shokouhi, Graham Neubig, Ahmed Hassan Awadallah
2021 arXiv   pre-print
The combination of multilingual pre-trained representations and cross-lingual transfer learning is one of the most effective methods for building functional NLP systems for low-resource languages.  ...  However, for extremely low-resource languages without large-scale monolingual corpora for pre-training or sufficient annotated data for fine-tuning, transfer learning remains an under-studied and challenging  ...  Acknowledgements We thank the anonymous reviewers for their constructive feedback, and Wei Wang for valuable discussions.  ... 
arXiv:2104.07908v1 fatcat:xqeubatcxzhqjhqs2shdjoh5py
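MetaXL's central component, per the abstract, is a small representation transformation network inserted into the multilingual encoder and trained with a bi-level (meta-learning) objective so that representations from the auxiliary source language become more useful for the target task. The module below is a minimal bottleneck-MLP-with-residual sketch of such a transformation; the bi-level training loop and the choice of insertion layer are omitted, and all names are illustrative.

```python
# Minimal representation-transformation module (illustrative; bi-level training omitted).
import torch
import torch.nn as nn

class RepresentationTransform(nn.Module):
    """Bottleneck MLP with a residual connection, applied to one layer's hidden states."""

    def __init__(self, hidden_dim, bottleneck_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, bottleneck_dim),
            nn.ReLU(),
            nn.Linear(bottleneck_dim, hidden_dim),
        )

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_dim) from an intermediate encoder layer
        return hidden_states + self.net(hidden_states)

# Would be applied to source-language batches at some intermediate encoder layer;
# in MetaXL its parameters are updated against the target-language (meta) loss.
transform = RepresentationTransform(hidden_dim=768)
h = torch.randn(2, 16, 768)
print(transform(h).shape)  # torch.Size([2, 16, 768])
```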

A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios [article]

Michael A. Hedderich, Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow
2021 arXiv   pre-print
Motivated by the recent fundamental changes towards neural models and the popular pre-train and fine-tune paradigm, we survey promising approaches for low-resource natural language processing.  ...  This includes mechanisms to create additional labeled data like data augmentation and distant supervision as well as transfer learning settings that reduce the need for target supervision.  ...  Zero-shot reading comprehension by cross-lingual transfer learning with multi-lingual lan-  ...  bert: A pre-trained language model for low resource nuclear domain. arXiv preprint arXiv:  ... 
arXiv:2010.12309v3 fatcat:26dwmlkmn5auha2ob2qdlrvla4

Meta-Transfer Learning for Low-Resource Abstractive Summarization [article]

Yi-Syuan Chen, Hong-Han Shuai
2021 arXiv   pre-print
In this paper, we propose to utilize two knowledge-rich sources to tackle this problem, which are large pre-trained models and diverse existing corpora.  ...  However, when encountering novel tasks, one may not always benefit from transfer learning due to the domain shifting problem, and overfitting could happen without adequate labeled examples.  ...  Acknowledgements We are grateful to the National Center for High-performance Computing for computer time and facilities.  ... 
arXiv:2102.09397v2 fatcat:5ttypkj2yfb3zn4i6vt4fzwfsy

Transfer Learning in Natural Language Processing

Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf
2019 Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials  
layers or modules inside the pre-trained model.  ...  He has open-sourced several widely used libraries for coreference resolution and transfer learning models in NLP and maintains a blog with practical tips for training large-scale transfer-learning and meta-learning  ... 
doi:10.18653/v1/n19-5004 dblp:conf/naacl/RuderPSW19 fatcat:g5ynzyjgabbohklhonqofws26a

Low-Resource Adaptation of Neural NLP Models [article]

Farhad Nooralahzadeh
2020 arXiv   pre-print
To this end, we study distant supervision and sequential transfer learning in various low-resource settings.  ...  Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data.  ...  In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.  ... 
arXiv:2011.04372v1 fatcat:626mbe5ba5bkdflv755o35u5pq

Knowledge Extraction in Low-Resource Scenarios: Survey and Perspective [article]

Shumin Deng, Ningyu Zhang, Hui Chen, Feiyu Xiong, Jeff Z. Pan, Huajun Chen
2022 arXiv   pre-print
... and (3) exploiting data and models together.  ...  In addition, we describe promising applications and outline some potential directions for future research.  ...  relations to target relations. (2) Pre-trained Language Representations: transfer learning based on pre-trained language representations uses pre-trained language representations that are trained on unlabeled  ... 
arXiv:2202.08063v1 fatcat:2q64tx2mzne53gt24adi6ymj7a

Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation [article]

Changhan Wang, Juan Pino, Jiatao Gu
2020 arXiv   pre-print
Pre-trained or jointly trained encoder-decoder models, however, do not share the language modeling (decoder) for the same language, which is likely to be inefficient for distant target languages.  ...  Transfer learning from high-resource languages is known to be an efficient way to improve end-to-end automatic speech recognition (ASR) for low-resource languages.  ... 
arXiv:2006.05474v2 fatcat:2gazgm3nijbrffsvy5zp5ocjna
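The inefficiency noted above (the source model's decoder does not model the target language) motivates initializing the target-language ASR decoder from a speech-translation model that already generates target-language text. The toy sketch below shows only that parameter-transfer pattern, using small LSTM stand-ins and assumed pre-trained checkpoints; it is not the paper's actual architecture or training recipe, and it assumes the speech-translation model and the target ASR model share the same target-language vocabulary.

```python
# Illustrative parameter transfer: encoder from high-resource ASR, decoder from a
# speech-translation (ST) model that already generates target-language text.
import torch
import torch.nn as nn

def make_seq2seq(vocab_size=1000, d_model=256):
    return nn.ModuleDict({
        "encoder": nn.LSTM(80, d_model, num_layers=2, batch_first=True),
        "decoder": nn.LSTM(d_model, d_model, num_layers=2, batch_first=True),
        "output": nn.Linear(d_model, vocab_size),
    })

hi_res_asr = make_seq2seq()        # assumed: trained on a high-resource language
st_model = make_seq2seq()          # assumed: speech translation into the target language
target_asr = make_seq2seq()        # low-resource target-language ASR to initialize

# encoder (acoustic modelling) comes from the high-resource ASR model,
# decoder + output projection (target-language modelling) come from the ST model
target_asr["encoder"].load_state_dict(hi_res_asr["encoder"].state_dict())
target_asr["decoder"].load_state_dict(st_model["decoder"].state_dict())
target_asr["output"].load_state_dict(st_model["output"].state_dict())
# target_asr would then be fine-tuned on the small target-language ASR corpus.
```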
Showing results 1 — 15 out of 12,538 results