Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation

Raj Dabre, Atsushi Fujita, Chenhui Chu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
This paper highlights the impressive utility of multi-parallel corpora for transfer learning in a one-to-many low-resource neural machine translation (NMT) setting. We report on a systematic comparison of multistage fine-tuning configurations, consisting of (1) pre-training on an external large (209k-440k) parallel corpus for English and a helping target language, (2) mixed pre-training or fine-tuning on a mixture of the external and low-resource (18k) target parallel corpora, and (3) pure fine-tuning on the target parallel corpora. Our experiments confirm that multi-parallel corpora are extremely useful despite their scarcity and content-wise redundancy, thus exhibiting the true power of multilingualism. Even when the helping target language is not one of the target languages of our concern, our multistage fine-tuning can give 3-9 BLEU score gains over a simple one-to-one model.
doi:10.18653/v1/d19-1146 dblp:conf/emnlp/DabreFC19 fatcat:xwdq2gdw7fhyfm4ze7medtzftm
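
The three-stage schedule summarized in the abstract can be sketched as follows. This is a minimal Python sketch of the training pipeline as described, not the authors' implementation: the `train` helper, corpus names, sizes, and epoch counts are hypothetical placeholders standing in for actual NMT training runs with a sequence-to-sequence toolkit.

```python
def train(model, corpus, epochs):
    """Hypothetical stand-in for one NMT training run on `corpus`."""
    print(f"training on {corpus['name']} ({corpus['size']} sentence pairs) for {epochs} epochs")
    return model  # parameters carry over to the next stage

def multistage_finetune(model, helper_corpus, target_corpora):
    # Stage 1: pre-train on the large external English-to-helping-language corpus.
    model = train(model, helper_corpus, epochs=20)

    # Stage 2: mixed pre-training / fine-tuning on the external corpus plus the
    # small multi-parallel target corpora.
    mixed = {
        "name": "external + targets (mixed)",
        "size": helper_corpus["size"] + sum(c["size"] for c in target_corpora),
    }
    model = train(model, mixed, epochs=10)

    # Stage 3: pure fine-tuning on the low-resource target parallel corpora only.
    targets_only = {
        "name": "targets only",
        "size": sum(c["size"] for c in target_corpora),
    }
    model = train(model, targets_only, epochs=5)
    return model

if __name__ == "__main__":
    # Sizes mirror the abstract: a 209k-440k external corpus and 18k target corpora.
    helper = {"name": "en-helper", "size": 440_000}
    targets = [{"name": f"en-target{i}", "size": 18_000} for i in range(1, 4)]
    multistage_finetune(model=None, helper_corpus=helper, target_corpora=targets)
```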