
Understanding and Improving Lexical Choice in Non-Autoregressive Translation [article]

Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, Dacheng Tao, Zhaopeng Tu
2021 arXiv   pre-print
Knowledge distillation (KD) is essential for training non-autoregressive translation (NAT) models by reducing the complexity of the raw data with an autoregressive teacher model.  ...  To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choice of the NAT model and that embedded in the raw data.  ...  In recent years, there has been a growing interest in non-autoregressive translation (NAT; Gu et al., 2018), which improves decoding efficiency by predicting all tokens independently and simultaneously  ... 
arXiv:2012.14583v2 fatcat:do6ox2ecufdazk7tlwhhwixzkq
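As a rough illustration of the idea in this abstract, the sketch below shows one way an auxiliary KL term could pull a NAT model's token distribution toward a lexical-choice prior estimated from the raw (un-distilled) parallel data. This is an assumption-laden PyTorch sketch, not the authors' code; the tensor names (nat_logits, raw_prior, pad_mask) and the weighting scheme are illustrative.

    import torch
    import torch.nn.functional as F

    def lexical_choice_kl(nat_logits, raw_prior, pad_mask):
        # nat_logits: (batch, tgt_len, vocab) decoder scores from the NAT student
        # raw_prior:  (batch, tgt_len, vocab) lexical-choice distribution from raw data
        # pad_mask:   (batch, tgt_len) bool tensor, True at padding positions
        log_q = F.log_softmax(nat_logits, dim=-1)
        kl = (raw_prior * (raw_prior.clamp_min(1e-9).log() - log_q)).sum(-1)
        kl = kl.masked_fill(pad_mask, 0.0)
        return kl.sum() / (~pad_mask).sum()

    # hypothetical usage: total_loss = kd_loss + lambda_lex * lexical_choice_kl(logits, prior, mask)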

Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation [article]

Liang Ding, Longyue Wang, Xuebo Liu, Derek F. Wong, Dacheng Tao, Zhaopeng Tu
2022 arXiv   pre-print
Knowledge distillation (KD) is commonly used to construct synthetic data for training non-autoregressive translation (NAT) models.  ...  Results demonstrate that the proposed approach can significantly and universally improve translation quality by reducing translation errors on low-frequency words.  ...  Wong were supported in part by the Science and Technology Development Fund, Macau SAR (Grant No. 0101/2019/A2), and the Multi-year Research Grant from the University of Macau (Grant No.  ... 
arXiv:2106.00903v2 fatcat:r3i5cfihifhy5j4i5gey5la3cy

Neighbors Are Not Strangers: Improving Non-Autoregressive Translation under Low-Frequency Lexical Constraints [article]

Chun Zeng, Jiangjie Chen, Tianyi Zhuang, Rui Xu, Hao Yang, Ying Qin, Shimin Tao, Yanghua Xiao
2022 arXiv   pre-print
However, current autoregressive approaches suffer from high latency. In this paper, we focus on non-autoregressive translation (NAT) for this problem, owing to its efficiency advantage.  ...  Experiments on the general and domain datasets show that our model improves over the backbone constrained NAT model in constraint preservation and translation quality, especially for rare constraints.  ...  Acknowledgement: We would like to thank Xinyao Shen and Shineng Fang at Fudan University as well as Yimeng Chen at Huawei for their support in implementation.  ... 
arXiv:2204.13355v1 fatcat:nojzplepcbe7rhfdcc2gokpbsa

EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints [article]

Weijia Xu, Marine Carpuat
2021 arXiv   pre-print
We introduce an Edit-Based Transformer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice.  ...  Building on recent models for non-autoregressive sequence generation (Gu et al., 2019), EDITOR generates new sequences by iteratively editing hypotheses.  ...  This research is supported in part by an Amazon Web Services Machine Learning Research Award and by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity  ... 
arXiv:2011.06868v2 fatcat:qlful5am7rfgfoavy6okneromm

EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints

Weijia Xu, Marine Carpuat
2021 Transactions of the Association for Computational Linguistics  
We introduce an Edit-Based TransfOrmer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice.  ...  Building on recent models for non-autoregressive sequence generation (Gu et al., 2019), EDITOR generates new sequences by iteratively editing hypotheses.  ...  By contrast, autoregressive NMT models (Bahdanau et al., 2015; Vaswani et al., 2017) do not explicitly separate lexical choice and reordering, and previous non-autoregressive models break up reordering  ... 
doi:10.1162/tacl_a_00368 fatcat:xgtbmlvwpvhllaerctg6h4ao4a
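To make the edit-based decoding idea concrete, here is a minimal skeleton of iterative refinement with soft lexical constraints, written under assumed interfaces: the model methods (reposition_and_delete, insert_placeholders, fill_placeholders) are hypothetical names standing in for repositioning, placeholder insertion, and token prediction passes, and the stopping rule is a simple fixed-point check rather than the paper's exact criterion.

    def iterative_edit_decode(model, src, constraints=(), max_iters=10):
        # Soft constraints seed the initial hypothesis; later passes may keep,
        # move, or drop them, which is what makes the constraints "soft".
        hyp = list(constraints)
        for _ in range(max_iters):
            new_hyp = model.reposition_and_delete(src, hyp)    # hypothetical API
            new_hyp = model.insert_placeholders(src, new_hyp)  # hypothetical API
            new_hyp = model.fill_placeholders(src, new_hyp)    # hypothetical API
            if new_hyp == hyp:  # stop once the hypothesis is stable
                break
            hyp = new_hyp
        return hyp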

Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT [article]

Elena Voita, Rico Sennrich, Ivan Titov
2021 arXiv   pre-print
Additionally, we explain how such an understanding of the training process can be useful in practice and, as an example, show how it can be used to improve vanilla non-autoregressive neural machine translation  ...  In this work, we look at the competences related to three core SMT components and find that during training, NMT first focuses on learning target-side language modeling, then improves translation quality  ...  Ivan Titov acknowledges support of the European Research Council (ERC StG BroadSem 678254), Dutch National Science Foundation (VIDI 639.022.518) and EU Horizon 2020 (GoURMET, no. 825299).  ... 
arXiv:2109.01396v1 fatcat:wqnh4d3q4zffvdp376iclb4tqm

A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond [article]

Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, Min Zhang, Tao Qin, Tie-yan Liu
2022 arXiv   pre-print
In this paper, we conduct a systematic survey with comparisons and discussions of various non-autoregressive translation (NAT) models from different aspects.  ...  Non-autoregressive (NAR) generation, which was first proposed in neural machine translation (NMT) to speed up inference, has attracted much attention in both machine learning and natural language processing  ...  Fully non-autoregressive decoding: Iter. 1: We are proud of our performance, but we want to win every game.  ... 
arXiv:2204.09269v1 fatcat:vzyj7eypp5gynp4mkq6eqeqoti
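For readers unfamiliar with the contrast drawn in this survey, the following sketch shows the shape of fully non-autoregressive decoding: predict a target length, then emit every target token in one parallel decoder pass instead of left to right. All interface names (encode, predict_length, init_decoder_inputs, decode) are assumptions for illustration, not a specific toolkit's API.

    import torch

    @torch.no_grad()
    def fully_nat_decode(model, src_tokens):
        enc = model.encode(src_tokens)                        # hypothetical encoder call
        tgt_len = model.predict_length(enc)                   # length-prediction head
        dec_inputs = model.init_decoder_inputs(enc, tgt_len)  # e.g. copied source states
        logits = model.decode(dec_inputs, enc)                # single parallel decoder pass
        return logits.argmax(dim=-1)                          # all target tokens chosen at once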

Text Compression-aided Transformer Encoding [article]

Zuchao Li, Zhuosheng Zhang, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita
2021 IEEE Transactions on Pattern Analysis and Machine Intelligence   pre-print
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.  ...  It has been done well by the self-attention mechanism in the current state-of-the-art Transformer encoder, which has brought about significant improvements in the performance of many NLP tasks.  ...  ACKNOWLEDGMENTS We thank Kevin Parnow, from the Department of Computer Science and Engineering, Shanghai Jiao Tong University (parnow@sjtu.edu.cn) for editing a draft of this manuscript.  ... 
doi:10.1109/tpami.2021.3058341 pmid:33577448 arXiv:2102.05951v1 fatcat:eveczqdvwfdcfcihc725lvvp54

Context-Aware Cross-Attention for Non-Autoregressive Translation [article]

Liang Ding, Longyue Wang, Di Wu, Dacheng Tao, Zhaopeng Tu
2020 arXiv   pre-print
Non-autoregressive translation (NAT) significantly accelerates the inference process by predicting the entire target sequence.  ...  Experimental results on several representative datasets show that our approach can consistently improve translation quality over strong NAT baselines.  ...  We are grateful to the anonymous reviewers and the area chair for their insightful comments and suggestions.  ... 
arXiv:2011.00770v1 fatcat:xjswvkydxnfbbmusziwlnvvnqe

Cognitive precursors of the developmental relation between lexical quality and reading comprehension in the intermediate elementary grades

Nicole M. Swart, Marloes M.L. Muijselaar, Esther G. Steenbeek-Planting, Mienke Droop, Peter F. de Jong, Ludo Verhoeven
2017 Learning and Individual Differences  
Cognitive precursors of the developmental relation between lexical quality and reading comprehension in the intermediate elementary grades.  ...  Research examining these differences in structural relations is warranted in order to improve our understanding of developmental differences and consequently improve education.  ...  In classroom settings, children should be stimulated to improve their vocabulary knowledge in order to improve their reading comprehension skills and vice versa.  ... 
doi:10.1016/j.lindif.2017.08.009 fatcat:oadnhzktwjbk3ftxkelulhuite

Improved Variational Neural Machine Translation by Promoting Mutual Information [article]

Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong
2019 arXiv   pre-print
As a result, the proposed model yields improved translation quality while demonstrating superior performance in terms of data efficiency and robustness.  ...  In this work, we address this problem in variational neural machine translation by explicitly promoting mutual information between the latent variables and the data.  ...  Disfluent words or absences are marked in red, and slightly incorrect lexical choice is marked in blue. Romanian diacritics have been stripped. Source: ma intristeaza foarte tare.
arXiv:1909.09237v1 fatcat:dhnqus7sorei5bpfngdtesvwxi
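For orientation, the general shape of such an objective is a conditional ELBO augmented with a mutual-information bonus; the paper's exact formulation may differ, so treat this as a hedged sketch, with lambda weighting the added term:

    \mathcal{L}(\theta, \phi) =
      \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
      - \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
      + \lambda \, I_q(z; y)

Here x is the source sentence, y the target, z the latent variable, and I_q(z; y) the mutual information between the latent code and the data under the approximate posterior.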

Extended Study on Using Pretrained Language Models and YiSi-1 for Machine Translation Evaluation

Chi-kiu Lo
2020 Conference on Machine Translation  
Although the recently proposed contextual embedding based metric, YiSi-1, significantly outperforms BLEU and other metrics in correlating with human judgment on translation quality, we have yet to understand  ...  We present an extended study on using pretrained language models and YiSi-1 for machine translation evaluation.  ...  CamemBERT for French, BERT for Japanese and Chinese) would be a better model choice.  ... 
dblp:conf/wmt/Lo20 fatcat:hcp736bstfft3bpihdq66xuql4

Neural machine translation: A review of methods, resources, and tools

Zhixing Tan, Shuo Wang, Zonghan Yang, Gang Chen, Xuancheng Huang, Maosong Sun, Yang Liu
2020 AI Open  
In recent years, end-to-end neural machine translation (NMT) has achieved great success and has become the new mainstream method in practical MT systems.  ...  In this article, we first provide a broad review of the methods for NMT and focus on methods relating to architectures, decoding, and data augmentation.  ...  Program of China (No. 2017YFB0202204), National Natural Science Foundation of China (No. 61925601, No. 61761166008, No. 61772302), Beijing Academy of Artificial Intelligence, Huawei Noah's Ark Lab, and  ... 
doi:10.1016/j.aiopen.2020.11.001 fatcat:wkplwv43knb3lebicckmwbxlwu

DRAG: Director-Generator Language Modelling Framework for Non-Parallel Author Stylized Rewriting [article]

Hrituraj Singh, Gaurav Verma, Aparna Garimella, Balaji Vasan Srinivasan
2021 arXiv   pre-print
Our quantitative and qualitative analyses further show that our model has better meaning retention and results in more fluent generations.  ...  Our experiments on corpora consisting of relatively small-sized texts authored by three distinct authors show significant improvements over existing works in rewriting input texts in the target author's style  ...  We also note that while the approach proposed in STYLELM (Syed et al., 2020) improves lexical scores significantly, it fails to bring the same level of improvement in surface and syntactic alignments  ... 
arXiv:2101.11836v1 fatcat:c2h3g32jnrfn7pcpu5dw3lhel4

Integrating Vectorized Lexical Constraints for Neural Machine Translation [article]

Shuo Wang, Zhixing Tan, Yang Liu
2022 arXiv   pre-print
Lexically constrained neural machine translation (NMT), which controls the generation of NMT models with pre-specified constraints, is important in many practical scenarios.  ...  Due to the representation gap between discrete constraints and continuous vectors in NMT models, most existing works choose to construct synthetic data or modify the decoding algorithm to impose lexical  ...  We sincerely thank Guanhua Chen and Chi Chen for their constructive advice on technical details, and all the reviewers for their valuable and insightful comments.  ... 
arXiv:2203.12210v1 fatcat:b4fgrupwyfchrauiqdidu4c26a
Showing results 1 — 15 out of 495 results