Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language Understanding
[article]
2021
arXiv
pre-print
However, previous studies sample the meta-training and meta-testing data from the same language, which limits the ability of the model for cross-lingual transfer. ...
Meta learning with auxiliary languages has demonstrated promising improvements for cross-lingual natural language processing. ...
size $\beta$, $\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\tau_i \sim p(\tau)} \mathcal{L}^{(1)}_{\tau_i}\big(f_{\theta - \alpha \nabla_\theta \mathcal{L}^{(0)}_{\tau_i}(f_\theta)}\big)$ (6). 3 Cross-lingual Adaption Model-Agnostic Meta-Learning ... [a code sketch of this meta-update follows this record]
Meta Task Formulation. Cross-lingual Natural Language Understanding aims ...
arXiv:2111.05805v1
fatcat:rr4ff4w24fdo5lyxkwsspx5lum
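The MAML meta-update quoted in the record above (Eq. 6) is an inner adaptation step on a support loss L^(0) followed by an outer update on a query loss L^(1). Below is a minimal first-order (FOMAML-style) sketch for a linear least-squares learner; the task format, the squared-error loss, and the step sizes alpha and beta are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def maml_meta_step(theta, tasks, alpha=0.01, beta=0.001):
    """One first-order MAML meta-update for a linear model y = x @ theta.

    Each task is a dict with 'support' and 'query' splits, standing in for the
    L^(0) / L^(1) losses in Eq. (6) above.  The second-order term of the exact
    MAML gradient is dropped (FOMAML-style simplification).
    """
    meta_grad = np.zeros_like(theta)
    for task in tasks:
        xs, ys = task["support"]          # data for the inner (L^(0)) loss
        xq, yq = task["query"]            # data for the outer (L^(1)) loss

        # Inner step: adapt theta on the support set.
        grad_inner = 2 * xs.T @ (xs @ theta - ys) / len(ys)
        theta_adapted = theta - alpha * grad_inner

        # Outer gradient: query loss evaluated at the adapted parameters.
        meta_grad += 2 * xq.T @ (xq @ theta_adapted - yq) / len(yq)

    # Meta-update with step size beta, averaged over the sampled tasks.
    return theta - beta * meta_grad / len(tasks)
```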
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment
[article]
2020
arXiv
pre-print
Motivated by these observations, we also present a meta-learning algorithm that obtains better cross-lingual transferability and alleviates negative interference, by adding language-specific layers as meta-parameters and training them in a manner that explicitly improves shared layers' generalization on all languages. ...
Multilingual language models such as mBERT (Devlin et al., 2018) and XLM (Lample and Conneau, 2019) have been proven effective for cross-lingual transfer learning by pretraining a single shared Transformer ...
arXiv:2010.03017v1
fatcat:d6c5lm4lcvawll6fexjcha2mne
Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages
[article]
2020
arXiv
pre-print
In our exploration, we show that a meta-embedding approach combining both pretrained and morphologically-informed word embeddings performs best in the downstream task of Xhosa-English translation. ...
To help circumvent this issue, we explore techniques exploiting the qualities of morphologically rich languages (MRLs), while leveraging pretrained word vectors in well-resourced languages. ...
Learning of cross-lingual word embeddings (Ruder, 2017) and projecting monolingual word embeddings to a single cross-lingual embedding space (Artetxe et al., 2018) have also been proposed to help learn ...
arXiv:2003.04419v3
fatcat:u32hmrgyfrevtjver5cjikfphu
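The record above describes a meta-embedding that combines pretrained high-resource word vectors with morphologically informed subword vectors. The sketch below shows one simple combination strategy, concatenation with a zero-vector fallback for out-of-vocabulary words; the dimensions, the fallback rule, and the concatenation itself are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def meta_embedding(word, pretrained, subword_model, dim_pre=300):
    """Combine a pretrained word vector with a subword-based vector by
    concatenation (an assumed combination rule, not necessarily the paper's).

    pretrained: dict mapping words to np.ndarray of shape (dim_pre,)
    subword_model: callable returning a vector built from the word's
                   character n-grams / morphemes
    """
    # Fall back to a zero vector for out-of-vocabulary words.
    pre_vec = pretrained.get(word, np.zeros(dim_pre))
    # Subword vectors can always be composed for morphologically rich languages.
    sub_vec = subword_model(word)
    return np.concatenate([pre_vec, sub_vec])
```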
From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers
[article]
2020
arXiv
pre-print
Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering ...
In this work, we analyze their limitations and show that cross-lingual transfer via massively multilingual transformers, much like transfer via cross-lingual word embeddings, is substantially less effective ...
observe the following phenomenon working with XLM-R: for a fixed model capacity, the cross-lingual transfer performance improves when adding more pretraining languages only up to a certain point. ...
arXiv:2005.00633v1
fatcat:hwaf2biiszhwbf5cwuyhfvycvi
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data
[article]
2021
arXiv
pre-print
Recently, sequence-to-sequence (seq-to-seq) models have been successfully applied in text-to-speech (TTS) to synthesize speech for single-language text. ...
To synthesize speech for multiple languages usually requires multi-lingual speech from the target speaker. ...
[18] introduces meta learning to improve multi-lingual TTS based on [17]. ...
arXiv:2110.07210v1
fatcat:jqtfnrqzebdupmflrsey7iyuse
Small and Practical BERT Models for Sequence Labeling
2019
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
We show that our model especially outperforms on low-resource languages, and works on codemixed input text without being explicitly trained on codemixed examples. ...
We showcase the effectiveness of our method by reporting on part-of-speech tagging and morphological prediction on 70 treebanks and 48 languages. ...
BERT Cross-Lingual Transfer. While Wu and Dredze (2019) show effective zero-shot cross-lingual transfer from English to other high-resource languages, we show that cross-lingual transfer is even effective ...
doi:10.18653/v1/d19-1374
dblp:conf/emnlp/TsaiRJALA19
fatcat:5azv4mjhkng5dm6iuqmqlwsc2m
Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-tuning
[article]
2022
arXiv
pre-print
... language model frozen. ...
... in gains of up to 1.7 points on cross-lingual NER fine-tuning. ...
As a testbed, we experiment with cross-lingual NER. ...
arXiv:2205.12453v1
fatcat:jqdrcaonireyfj6z5irxvv6hva
Cross-lingual Text Classification with Heterogeneous Graph Neural Network
[article]
2021
arXiv
pre-print
Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks, but rarely consider factors beyond semantic similarity, causing performance degradation ...
Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages, which is very useful for low-resource languages. ...
Learn to cross-lingual transfer with meta graph learning across heterogeneous languages. ...
arXiv:2105.11246v1
fatcat:xchnznsqwzb7tfdrwowgumueoq
Model Selection for Cross-Lingual Transfer
[article]
2021
arXiv
pre-print
We propose a machine learning approach to model selection that uses the fine-tuned model's own internal representations to predict its cross-lingual capabilities. ...
Transformers that are pre-trained on multilingual corpora, such as, mBERT and XLM-RoBERTa, have achieved impressive cross-lingual transfer capabilities. ...
Background: Cross-Lingual Transfer Learning. The zero-shot setting considered in this paper works as follows. A Transformer model is first pretrained using a standard masked language model objective. ...
arXiv:2010.06127v2
fatcat:ua4z67ka2fczrbwod44i4vmopm
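The record above describes predicting a fine-tuned model's cross-lingual performance from its own internal representations. A minimal sketch under stated assumptions follows: sentence-level hidden states are mean-pooled into features and a ridge regressor maps them to a measured target-language score; the pooling, the features, and the regressor are illustrative choices, not the paper's actual selector.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_transfer_predictor(hidden_states, target_scores):
    """Fit a regressor mapping pooled hidden states of fine-tuned checkpoints
    to their measured target-language scores.

    hidden_states: list of arrays, each (num_sentences, hidden_dim), one per checkpoint
    target_scores: list of floats, measured cross-lingual scores for those checkpoints
    """
    # Mean-pool each checkpoint's sentence representations into one feature vector.
    features = np.stack([h.mean(axis=0) for h in hidden_states])
    return Ridge(alpha=1.0).fit(features, target_scores)

def select_checkpoint(predictor, candidate_states):
    """Rank unseen checkpoints by predicted cross-lingual score; return the best index."""
    features = np.stack([h.mean(axis=0) for h in candidate_states])
    predicted = predictor.predict(features)
    return int(np.argmax(predicted)), predicted
```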
Improved Meta Learning for Low Resource Speech Recognition
[article]
2022
arXiv
pre-print
We propose a new meta learning based framework for low resource speech recognition that improves the previous model agnostic meta learning (MAML) approach. ...
Our proposed system outperforms the MAML-based low-resource ASR system on various languages in terms of character error rates and stable training behavior. ...
Recently, the unsupervised cross-lingual wav2vec 2.0 XLSR model [21] has shown a huge performance boost compared to previous state-of-the-art models. ...
arXiv:2205.06182v1
fatcat:p2zfdihujjaajha3epuvf3qbie
How Do Multilingual Encoders Learn Cross-lingual Representation?
[article]
2022
arXiv
pre-print
We also look at how to inject different cross-lingual signals into multilingual encoders, and the optimization behavior of cross-lingual transfer with these models. ...
As different languages have different amounts of supervision, cross-lingual transfer benefits languages with little to no training data by transferring from other languages. ...
Surprisingly, even without any explicit cross-lingual signal during pretraining, mBERT shows promising zero-shot cross-lingual performance: training the model on one language then directly applying that ...
arXiv:2207.05737v1
fatcat:j6vfurgdhvhm5evwaqjhf4b3lu
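Several records here describe the same zero-shot protocol: fine-tune a multilingual encoder on one language's labeled data, then apply the frozen checkpoint to other languages. The sketch below shows only that train-once, evaluate-everywhere loop; the `finetune` and `evaluate` helpers are hypothetical caller-supplied functions, and mBERT with a 3-class head (e.g. for NLI) is an assumed setup, not any specific paper's pipeline.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def zero_shot_transfer(train_data_en, eval_data_by_lang, finetune, evaluate,
                       model_name="bert-base-multilingual-cased"):
    """Zero-shot cross-lingual transfer: fine-tune on source-language (English)
    labeled data only, then evaluate the same checkpoint on every target
    language with no further training.  `finetune` and `evaluate` are
    hypothetical helpers; only the protocol itself is shown here.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

    # Train once, on source-language data.
    model = finetune(model, tokenizer, train_data_en)

    # Apply the frozen checkpoint to each unseen target language.
    return {lang: evaluate(model, tokenizer, data)
            for lang, data in eval_data_by_lang.items()}
```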
Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning
[article]
2022
arXiv
pre-print
Multilingual models jointly pretrained on multiple languages have achieved remarkable performance on various multilingual downstream tasks. ...
We extensively validate our method on various multi-task learning and zero-shot cross-lingual transfer tasks, where our method largely outperforms all the relevant baselines we consider. ...
Zero-shot Cross-Lingual Transfer. Zero-shot cross-lingual transfer trains a model with monolingual labeled data and evaluates it on unseen target languages without further fine-tuning the model ...
arXiv:2110.02600v3
fatcat:idg7cosqojcdvhdizacyla7lyy
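Sequential Reptile builds on the Reptile meta-learning rule. The sketch below shows plain (vanilla) Reptile in PyTorch, treating each language or task as one inner run; the task ordering that gives the method its gradient-alignment behavior is not reproduced here, and the step sizes, inner-loop length, and `task.sample_batch()` interface are assumptions.

```python
import torch
import torch.nn.functional as F

def reptile_step(model, tasks, inner_steps=5, inner_lr=1e-3, meta_lr=0.1):
    """One vanilla Reptile meta-update: for each task, run a few SGD steps from
    the current meta-parameters, then move them toward the adapted copy.
    (Sequential Reptile additionally orders tasks to align their gradients;
    that ordering is omitted in this sketch.)
    """
    theta = {k: v.detach().clone() for k, v in model.named_parameters()}
    for task in tasks:
        # Start each task's inner run from the current meta-parameters.
        with torch.no_grad():
            for name, p in model.named_parameters():
                p.copy_(theta[name])
        opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            x, y = task.sample_batch()        # hypothetical task interface
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        # Reptile update: theta <- theta + meta_lr * (phi - theta)
        with torch.no_grad():
            for name, p in model.named_parameters():
                theta[name] += meta_lr * (p.detach() - theta[name])
    # Write the final meta-parameters back into the model.
    with torch.no_grad():
        for name, p in model.named_parameters():
            p.copy_(theta[name])
    return model
```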
Small and Practical BERT Models for Sequence Labeling
[article]
2019
arXiv
pre-print
We show that our model especially outperforms on low-resource languages, and works on codemixed input text without being explicitly trained on codemixed examples. ...
We showcase the effectiveness of our method by reporting on part-of-speech tagging and morphological prediction on 70 treebanks and 48 languages. ...
BERT Cross-Lingual Transfer. While Wu and Dredze (2019) show effective zero-shot cross-lingual transfer from English to other high-resource languages, we show that cross-lingual transfer is even effective ...
arXiv:1909.00100v1
fatcat:mey3i3rhkvcifbvcftxatbxtkm
Multilingual Speech Recognition using Knowledge Transfer across Learning Processes
[article]
2021
arXiv
pre-print
Multilingual end-to-end (E2E) models have shown great potential for expanding language coverage in the realm of automatic speech recognition (ASR). ...
In this paper, we aim to enhance multilingual ASR performance in two ways: 1) studying the impact of feeding a one-hot vector identifying the language, and 2) formulating the task with a meta-learning objective ...
One of the major challenges for developing a multi-lingual ASR system is learning the cross-lingual representation for not only high-resource but also low-resource languages [16, 18, 20]. ...
arXiv:2110.07909v1
fatcat:rwn6tg7xvjbdhal25lnw3ldpzq
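The record above mentions feeding a one-hot vector identifying the language to the multilingual ASR model. A minimal sketch of that conditioning, appending the language ID to every acoustic frame, is shown below; the fixed language inventory and the frame-level concatenation are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

LANGUAGES = ["en", "de", "hi", "sw"]   # illustrative language inventory

def add_language_id(features, lang):
    """Append a one-hot language vector to every acoustic frame.

    features: (num_frames, feat_dim) array of e.g. filterbank features
    lang: language code, must be in LANGUAGES
    """
    one_hot = np.zeros(len(LANGUAGES), dtype=features.dtype)
    one_hot[LANGUAGES.index(lang)] = 1.0
    # Tile the language vector across frames and concatenate on the feature axis.
    tiled = np.tile(one_hot, (features.shape[0], 1))
    return np.concatenate([features, tiled], axis=1)
```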
Co-attentional Transformers for Story-Based Video Understanding
[article]
2020
arXiv
pre-print
Inspired by recent trends in vision and language learning, we explore applications of attention mechanisms for visio-lingual fusion within an application to story-based video understanding. ...
Our model outperforms the baseline model by 8 percentage points overall, at least 4.95 and up to 12.8 percentage points on all difficulty levels and manages to beat the winner of the DramaQA challenge. ...
To encode these language token sequences to obtain language representations we use a pretrained RoBERTa model [18], a variant of the widely successful BERT model that achieves significantly better performance ...
arXiv:2010.14104v1
fatcat:tavkbs4hj5bzligkn3cjiinv4a
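The record above mentions encoding token sequences with a pretrained RoBERTa model to obtain language representations. A minimal Hugging Face sketch of that encoding step follows; mean pooling over the final hidden states is an assumption here, not necessarily the paper's pooling choice.

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

def encode(texts):
    """Return one vector per input text, mean-pooled over RoBERTa's last layer."""
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state     # (batch, seq_len, 768)
    # Mask out padding tokens before averaging.
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```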
Showing results 1 — 15 out of 554 results