Constraining the Transformer NMT Model with Heuristic Grid Beam Search

Guodong Xie, Andy Way
2020 Conference of the Association for Machine Translation in the Americas  
Constrained decoding forces a certain set of words or phrases to appear in the translation output and is very useful when adapting MT to a particular domain. In recent years, the Transformer model has outperformed other neural machine translation models to become the state-of-the-art paradigm. However, constrained decoding for domain adaptation remains an open problem under the Transformer model. In this paper, we first investigate how a constrained decoding method, Grid Beam Search (GBS), performs with the Transformer model, and then propose a source-informed heuristic method that takes full advantage of the alignment information from the multi-head attention mechanism in the Transformer to speed up decoding in GBS and to guide the placement of constraints during the expansion of hypotheses. Experiments on English-Chinese and English-German domain adaptation translation tasks show that the proposed method significantly outperforms the basic Transformer model in terms of BLEU and METEOR scores, and prunes up to 30% of hypotheses, saving up to 20% of decoding time compared to the GBS model while maintaining comparable translation performance.
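
To make the decoding procedure concrete, below is a minimal, self-contained sketch of the Grid Beam Search bookkeeping described above. It is illustrative only, not the authors' implementation: the decoder is replaced by a uniform-probability stub, constraints are single tokens rather than phrases, and the `placement_filter` hook merely marks the point where a source-informed heuristic like the paper's, which consults the Transformer's multi-head attention alignments, could prune implausible constraint placements. All names (`Hyp`, `stub_decoder`, `placement_filter`) are hypothetical.

```python
import math
from dataclasses import dataclass

# Illustrative sketch of Grid Beam Search (Hokamp & Liu, 2017) bookkeeping.
# Hypotheses live in a grid indexed by (timestep t, constraints covered c);
# each cell is pruned to the beam size before the next step.

@dataclass
class Hyp:
    tokens: list       # target tokens generated so far
    score: float       # cumulative log-probability
    used: frozenset    # indices of constraints already placed

def stub_decoder(tokens, vocab):
    """Stand-in for the NMT decoder: uniform log-probs over the vocab."""
    lp = math.log(1.0 / len(vocab))
    return {w: lp for w in vocab}

def grid_beam_search(vocab, constraints, max_len=5, beam_size=2,
                     placement_filter=lambda hyp, i: True):
    # grid[(t, c)]: hypotheses of length t that cover c constraint tokens
    grid = {(0, 0): [Hyp([], 0.0, frozenset())]}
    for t in range(max_len):
        for c in range(len(constraints) + 1):
            for hyp in grid.get((t, c), []):
                logps = stub_decoder(hyp.tokens, vocab)
                # open expansion: generate any token, coverage unchanged
                for w, lp in logps.items():
                    grid.setdefault((t + 1, c), []).append(
                        Hyp(hyp.tokens + [w], hyp.score + lp, hyp.used))
                # constrained expansion: place a not-yet-used constraint,
                # but only where the heuristic deems placement plausible
                for i, w in enumerate(constraints):
                    if i not in hyp.used and placement_filter(hyp, i):
                        grid.setdefault((t + 1, c + 1), []).append(
                            Hyp(hyp.tokens + [w], hyp.score + logps[w],
                                hyp.used | {i}))
        # keep only the top beam_size hypotheses in each new cell
        for c in range(len(constraints) + 1):
            if (t + 1, c) in grid:
                grid[(t + 1, c)].sort(key=lambda h: -h.score)
                grid[(t + 1, c)] = grid[(t + 1, c)][:beam_size]
    # a finished hypothesis must have covered every constraint
    finished = grid.get((max_len, len(constraints)), [])
    return max(finished, key=lambda h: h.score, default=None)

best = grid_beam_search(["the", "cat", "sat"], constraints=["cat"])
print(best.tokens if best else None)
```

Because GBS maintains a separate beam for every coverage level, decoding cost grows roughly linearly with the number of constraint tokens; filtering constrained expansions, as the `placement_filter` hook suggests, is what makes the reported hypothesis pruning and decoding-time savings possible.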