Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models

Gongbo Tang (Department of Linguistics and Philology, Uppsala University, Sweden), Rico Sennrich (School of Informatics, University of Edinburgh, UK; Institute of Computational Linguistics, University of Zurich, Switzerland), Joakim Nivre (Department of Linguistics and Philology, Uppsala University, Sweden)
Proceedings of Recent Advances in Natural Language Processing (RANLP 2019): Natural Language Processing in a Deep Learning World
In this paper, we try to understand neural machine translation (NMT) by simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that directly attends to these embeddings via attention mechanisms. Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature extractor, (2) that the word embeddings in encoder-free models are competitive with those in conventional models, (3) that non-contextualized source representations lead to a substantial performance drop, and (4) that encoder-free models have different effects on alignment quality for German→English and Chinese→English.
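
The abstract describes the encoder-free setup concretely: the "encoder output" is just the sum of source word embeddings and positional embeddings, and a standard decoder attends to it directly. Below is a minimal PyTorch sketch of that idea. The class name EncoderFreeNMT, the hyperparameter defaults, and the choice of fixed sinusoidal positions are illustrative assumptions, not the paper's actual implementation.

```python
import math
import torch
import torch.nn as nn

class EncoderFreeNMT(nn.Module):
    """Illustrative encoder-free Transformer: the decoder attends directly
    to source word embeddings plus positional embeddings, with no encoder
    layers in between (so source representations are non-contextualized)."""

    def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8,
                 num_decoder_layers=6, max_len=512):
        super().__init__()
        self.d_model = d_model
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        # Fixed sinusoidal positional encodings (Vaswani et al., 2017).
        pe = torch.zeros(max_len, d_model)
        pos = torch.arange(max_len).unsqueeze(1).float()
        div = torch.exp(torch.arange(0, d_model, 2).float()
                        * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pe", pe)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_decoder_layers)
        self.out_proj = nn.Linear(d_model, tgt_vocab)

    def embed(self, tokens, table):
        # Sum of scaled word embeddings and positional embeddings.
        x = table(tokens) * math.sqrt(self.d_model)
        return x + self.pe[: tokens.size(1)]

    def forward(self, src, tgt):
        # The attention "memory" is the embedded source itself: no source
        # self-attention, hence no contextualization of source tokens.
        memory = self.embed(src, self.src_embed)
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        dec = self.decoder(self.embed(tgt, self.tgt_embed), memory,
                           tgt_mask=tgt_mask)
        return self.out_proj(dec)
```

A forward pass is simply `logits = model(src_ids, tgt_ids)` with batch-first integer tensors. Because the source side has no self-attention layers, each source position is represented in isolation; this is precisely the non-contextualized setting the abstract links to the performance drop, with the decoder's cross-attention left to do the feature extraction.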
doi:10.26615/978-954-452-056-4_136 dblp:conf/ranlp/TangSN19