554 Hits in 3.6 sec

Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language Understanding [article]

Qianying Liu, Fei Cheng, Sadao Kurohashi
2021 arXiv   pre-print
However, previous studies sample the meta-training and meta-testing data from the same language, which limits the ability of the model for cross-lingual transfer.  ...  Meta learning with auxiliary languages has demonstrated promising improvements for cross-lingual natural language processing.  ...  step size β: $\theta \leftarrow \theta - \beta \nabla_{\theta} \sum_{\tau_i \sim p(\tau)} \mathcal{L}^{(1)}_{\tau_i}\big(f_{\theta - \alpha \nabla_{\theta} \mathcal{L}^{(0)}_{\tau_i}(f_{\theta})}\big)$ (Eq. 6). Section 3 (Cross-lingual Adaption Model-Agnostic Meta-Learning), Meta Task Formulation: Cross-lingual Natural Language Understanding aims  ...
arXiv:2111.05805v1 fatcat:rr4ff4w24fdo5lyxkwsspx5lum
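The snippet above garbles the standard MAML meta-update that the paper builds on. Purely as an illustration of that update (not the authors' cross-lingual NLU setup), here is a minimal PyTorch sketch on toy regression tasks; `make_task`, the step sizes, and the task distribution are all invented for the example.

```python
# Minimal MAML sketch: inner adaptation on a support loss L^(0), meta-update from
# the query loss L^(1), matching theta <- theta - beta * grad sum_i L^(1)(f_{theta_i}).
import torch

def make_task():
    # Toy "task": a random linear map y = w*x + noise, split into support/query halves.
    w = torch.randn(1)
    x = torch.randn(32, 1)
    y = w * x + 0.01 * torch.randn(32, 1)
    return x[:16], y[:16], x[16:], y[16:]

theta = torch.zeros(1, requires_grad=True)   # meta-parameters
alpha, beta = 0.1, 0.01                      # inner / meta step sizes

for step in range(500):
    meta_grad = torch.zeros_like(theta)
    for _ in range(4):                       # tasks tau_i ~ p(tau)
        xs, ys, xq, yq = make_task()
        # Inner step on the support loss: theta_i = theta - alpha * grad L^(0)
        support_loss = ((xs * theta - ys) ** 2).mean()
        (g,) = torch.autograd.grad(support_loss, theta, create_graph=True)
        theta_i = theta - alpha * g
        # Query loss L^(1) of the adapted parameters, differentiated w.r.t. theta
        query_loss = ((xq * theta_i - yq) ** 2).mean()
        (mg,) = torch.autograd.grad(query_loss, theta)
        meta_grad = meta_grad + mg
    with torch.no_grad():
        theta -= beta * meta_grad            # meta-update across the sampled tasks
```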

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment [article]

Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov
2020 arXiv   pre-print
Motivated by these observations, we also present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference, by adding language-specific layers as meta-parameters and training them in a manner that explicitly improves shared layers' generalization on all languages.  ...  Multilingual language models such as mBERT (Devlin et al., 2018) and XLM (Lample and Conneau, 2019) have been proven effective for cross-lingual transfer learning by pretraining a single shared Transformer  ...
arXiv:2010.03017v1 fatcat:d6c5lm4lcvawll6fexjcha2mne

Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages [article]

Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo
2020 arXiv   pre-print
In our exploration, we show that a meta-embedding approach combining both pretrained and morphologically-informed word embeddings performs best in the downstream task of Xhosa-English translation.  ...  To help circumvent this issue, we explore techniques exploiting the qualities of morphologically rich languages (MRLs), while leveraging pretrained word vectors in well-resourced languages.  ...  Learning of cross-lingual word embeddings (Ruder, 2017) and projecting monolingual word embeddings to a single cross-lingual embedding space (Artetxe et al., 2018) have also been proposed to help learn  ... 
arXiv:2003.04419v3 fatcat:u32hmrgyfrevtjver5cjikfphu
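As a concrete, hypothetical illustration of the meta-embedding idea in the snippet above, the sketch below averages whichever embedding sources cover a word; the Xhosa words, vectors, and dimensionality are invented, and the paper itself compares richer combination strategies.

```python
# Toy meta-embedding: combine a projected cross-lingual table with a
# morphologically-informed (subword) table by averaging per-word vectors.
import numpy as np

dim = 4  # illustrative; real embeddings are typically 100-300 dimensional

pretrained = {"umntu": np.ones(dim), "ukutya": np.full(dim, 0.5)}   # hypothetical projected vectors
subword = {"umntu": np.zeros(dim), "abantu": np.full(dim, 2.0)}     # hypothetical subword vectors

def meta_embed(word):
    """Average every source that covers the word."""
    vecs = [table[word] for table in (pretrained, subword) if word in table]
    if not vecs:
        raise KeyError(f"no source embedding for {word!r}")
    return np.mean(vecs, axis=0)

print(meta_embed("umntu"))   # covered by both sources: element-wise average
print(meta_embed("abantu"))  # only in the subword table: returned unchanged
```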

From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers [article]

Anne Lauscher and Vinit Ravishankar and Ivan Vulić and Goran Glavaš
2020 arXiv   pre-print
Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering  ...  In this work, we analyze their limitations and show that cross-lingual transfer via massively multilingual transformers, much like transfer via cross-lingual word embeddings, is substantially less effective  ...  observe the following phenomenon working with XLM-R: for a fixed model capacity, the cross-lingual transfer performance improves when adding more pretraining languages only up to a certain point.  ... 
arXiv:2005.00633v1 fatcat:hwaf2biiszhwbf5cwuyhfvycvi

Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data [article]

Haitong Zhang, Yue Lin
2021 arXiv   pre-print
Recently, sequence-to-sequence (seq-to-seq) models have been successfully applied in text-to-speech (TTS) to synthesize speech for single-language text.  ...  Synthesizing speech for multiple languages usually requires multi-lingual speech from the target speaker.  ...  [18] introduces meta learning to improve multi-lingual TTS based on [17].  ...
arXiv:2110.07210v1 fatcat:jqtfnrqzebdupmflrsey7iyuse

Small and Practical BERT Models for Sequence Labeling

Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
We show that our model especially outperforms on low-resource languages, and works on codemixed input text without being explicitly trained on codemixed examples.  ...  We showcase the effectiveness of our method by reporting on part-of-speech tagging and morphological prediction on 70 treebanks and 48 languages.  ...  BERT Cross-Lingual Transfer. While Wu and Dredze (2019) show effective zero-shot cross-lingual transfer from English to other high-resource languages, we show that cross-lingual transfer is even effective  ...
doi:10.18653/v1/d19-1374 dblp:conf/emnlp/TsaiRJALA19 fatcat:5azv4mjhkng5dm6iuqmqlwsc2m

Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-tuning [article]

Mozhdeh Gheini, Xuezhe Ma, Jonathan May
2022 arXiv   pre-print
language model frozen.  ...  in gains of up to 1.7 points on cross-lingual NER fine-tuning.  ...  As a testbed, we experiment with cross-lingual NER.  ... 
arXiv:2205.12453v1 fatcat:jqdrcaonireyfj6z5irxvv6hva
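To make the "frozen language model" setting concrete: below is a minimal sketch of parameter-efficient fine-tuning in which only a small task head is trained. The pretrained encoder is replaced here by a stand-in module, and the meta-learned initialization the paper proposes is deliberately omitted.

```python
# Sketch: freeze the pretrained encoder, train only a small NER head.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Embedding(1000, 64), nn.Linear(64, 64), nn.ReLU())  # stand-in for a pretrained LM
for p in encoder.parameters():
    p.requires_grad = False               # language model stays frozen

ner_head = nn.Linear(64, 9)               # small trainable head (e.g., 9 NER tags)
optimizer = torch.optim.Adam(ner_head.parameters(), lr=1e-3)  # only head parameters update

tokens = torch.randint(0, 1000, (8, 16))  # fake batch: 8 sentences of 16 tokens
labels = torch.randint(0, 9, (8, 16))

logits = ner_head(encoder(tokens))        # (8, 16, 9)
loss = nn.functional.cross_entropy(logits.reshape(-1, 9), labels.reshape(-1))
loss.backward()                           # gradients flow only into ner_head
optimizer.step()
```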

Cross-lingual Text Classification with Heterogeneous Graph Neural Network [article]

Ziyun Wang, Xuan Liu, Peiji Yang, Shixing Liu, Zhisheng Wang
2021 arXiv   pre-print
Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks, but rarely consider factors beyond semantic similarity, causing performance degradation  ...  Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages, which is very useful for low-resource languages.  ...  Learn to cross-lingual transfer with meta graph learning across heterogeneous languages.  ... 
arXiv:2105.11246v1 fatcat:xchnznsqwzb7tfdrwowgumueoq

Model Selection for Cross-Lingual Transfer [article]

Yang Chen, Alan Ritter
2021 arXiv   pre-print
We propose a machine learning approach to model selection that uses the fine-tuned model's own internal representations to predict its cross-lingual capabilities.  ...  Transformers that are pre-trained on multilingual corpora, such as mBERT and XLM-RoBERTa, have achieved impressive cross-lingual transfer capabilities.  ...  Background: Cross-Lingual Transfer Learning. The zero-shot setting considered in this paper works as follows. A Transformer model is first pretrained using a standard masked language model objective.  ...
arXiv:2010.06127v2 fatcat:ua4z67ka2fczrbwod44i4vmopm
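The "Background" snippet above describes the common zero-shot cross-lingual transfer recipe. A sketch of that recipe with HuggingFace Transformers follows; the checkpoint, label count, and example sentence are illustrative, and the source-language fine-tuning loop is only indicated in comments.

```python
# Zero-shot cross-lingual transfer: fine-tune on the source language, evaluate
# directly on a target language with no target-language labels.
from transformers import AutoTokenizer, AutoModelForTokenClassification

model_name = "bert-base-multilingual-cased"   # mBERT, pretrained with a masked-LM objective
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=7)

# 1) Fine-tune on labeled source-language data (typically English); training loop omitted.
# 2) Evaluate directly on the target language, without further fine-tuning:
text = "Angela Merkel besuchte gestern Paris."  # German input, never seen with labels
inputs = tokenizer(text, return_tensors="pt")
pred_tags = model(**inputs).logits.argmax(-1)   # zero-shot token-level predictions
```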

Improved Meta Learning for Low Resource Speech Recognition [article]

Satwinder Singh, Ruili Wang, Feng Hou
2022 arXiv   pre-print
We propose a new meta-learning-based framework for low-resource speech recognition that improves the previous model-agnostic meta-learning (MAML) approach.  ...  Our proposed system outperforms the MAML-based low-resource ASR system on various languages in terms of character error rates and stable training behavior.  ...  Recently, the unsupervised cross-lingual wav2vec 2.0 XLSR model [21] has shown a huge performance boost compared to previous state-of-the-art models.  ...
arXiv:2205.06182v1 fatcat:p2zfdihujjaajha3epuvf3qbie

How Do Multilingual Encoders Learn Cross-lingual Representation? [article]

Shijie Wu
2022 arXiv   pre-print
We also look at how to inject different cross-lingual signals into multilingual encoders, and the optimization behavior of cross-lingual transfer with these models.  ...  As different languages have different amounts of supervision, cross-lingual transfer benefits languages with little to no training data by transferring from other languages.  ...  Surprisingly, even without any explicit cross-lingual signal during pretraining, mBERT shows promising zero-shot cross-lingual performance: training the model on one language then directly applying that  ...
arXiv:2207.05737v1 fatcat:j6vfurgdhvhm5evwaqjhf4b3lu

Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning [article]

Seanie Lee, Hae Beom Lee, Juho Lee, Sung Ju Hwang
2022 arXiv   pre-print
Multilingual models jointly pretrained on multiple languages have achieved remarkable performance on various multilingual downstream tasks.  ...  We extensively validate our method on various multi-task learning and zero-shot cross-lingual transfer tasks, where our method largely outperforms all the relevant baselines we consider.  ...  Zero-shot Cross-Lingual Transfer. Zero-shot cross-lingual transfer trains a model with monolingual labeled data and evaluates it on unseen target languages without further fine-tuning the model  ...
arXiv:2110.02600v3 fatcat:idg7cosqojcdvhdizacyla7lyy
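For intuition only, a toy Reptile-style update is sketched below, with inner-loop updates drawn from different "languages" in sequence within one trajectory. This is a generic sketch of the Reptile family on a toy quadratic objective, not the paper's Sequential Reptile algorithm or its gradient-alignment analysis.

```python
# Toy Reptile-style meta-update: run an inner SGD trajectory that visits several
# tasks in sequence, then move the shared parameters toward the adapted ones.
import torch

def task_loss(params, center):
    return ((params - center) ** 2).sum()      # stand-in per-task objective

theta = torch.zeros(2)                          # shared parameters
task_centers = [torch.tensor([1.0, 0.0]), torch.tensor([0.0, 1.0])]  # two "languages"
inner_lr, outer_lr = 0.1, 0.5

for step in range(100):
    phi = theta.clone().requires_grad_(True)
    for center in task_centers:                 # sequential pass over tasks in one trajectory
        loss = task_loss(phi, center)
        (g,) = torch.autograd.grad(loss, phi)
        phi = (phi - inner_lr * g).detach().requires_grad_(True)
    theta = theta + outer_lr * (phi.detach() - theta)   # Reptile outer step
```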

Small and Practical BERT Models for Sequence Labeling [article]

Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer
2019 arXiv   pre-print
We show that our model especially outperforms on low-resource languages, and works on codemixed input text without being explicitly trained on codemixed examples.  ...  We showcase the effectiveness of our method by reporting on part-of-speech tagging and morphological prediction on 70 treebanks and 48 languages.  ...  BERT Cross-Lingual Transfer. While Wu and Dredze (2019) show effective zero-shot cross-lingual transfer from English to other high-resource languages, we show that cross-lingual transfer is even effective  ...
arXiv:1909.00100v1 fatcat:mey3i3rhkvcifbvcftxatbxtkm

Multilingual Speech Recognition using Knowledge Transfer across Learning Processes [article]

Rimita Lahiri, Kenichi Kumatani, Eric Sun, Yao Qian
2021 arXiv   pre-print
Multilingual end-to-end (E2E) models have shown great potential in expanding language coverage in automatic speech recognition (ASR).  ...  In this paper, we aim to enhance multilingual ASR performance in two ways: 1) studying the impact of feeding a one-hot vector identifying the language, and 2) formulating the task with a meta-learning objective  ...  One of the major challenges in developing a multi-lingual ASR system is learning cross-lingual representations for not only high-resource but also low-resource languages [16, 18, 20].  ...
arXiv:2110.07909v1 fatcat:rwn6tg7xvjbdhal25lnw3ldpzq
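To illustrate the first idea in the snippet above (feeding a one-hot vector identifying the language), a small sketch follows; the language inventory, feature dimensionality, and frame count are all illustrative.

```python
# Condition acoustic features on a one-hot language ID by concatenating it to every frame.
import torch

languages = ["en", "de", "it", "es"]       # illustrative language inventory
feat_dim, lang = 80, "de"                  # e.g., 80-dim log-Mel features

one_hot = torch.zeros(len(languages))
one_hot[languages.index(lang)] = 1.0       # one-hot language identifier

frames = torch.randn(200, feat_dim)        # (time, features) for one utterance
conditioned = torch.cat([frames, one_hot.expand(frames.size(0), -1)], dim=-1)
print(conditioned.shape)                   # torch.Size([200, 84]): features + language ID
```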

Co-attentional Transformers for Story-Based Video Understanding [article]

Björn Bebensee, Byoung-Tak Zhang
2020 arXiv   pre-print
Inspired by recent trends in vision and language learning, we explore applications of attention mechanisms for visio-lingual fusion within an application to story-based video understanding.  ...  Our model outperforms the baseline by 8 percentage points overall (at least 4.95 and up to 12.8 percentage points across all difficulty levels) and beats the winner of the DramaQA challenge.  ...  To encode these language token sequences and obtain language representations, we use a pretrained RoBERTa model [18], a variant of the widely successful BERT model that achieves significantly better performance  ...
arXiv:2010.14104v1 fatcat:tavkbs4hj5bzligkn3cjiinv4a
Showing results 1–15 of 554