Incremental Training and Intentional Over-fitting of Word Alignment

Qin Gao, William Lewis, Chris Quirk, Mei-Yuh Hwang
2011 Machine Translation Summit  
We investigate two problems in word alignment for machine translation. First, we compare methods for incremental word alignment that save training time for large-scale machine translation systems: several ways of using existing word alignment models, trained on a larger general corpus, to incrementally align smaller new corpora are compared. In addition, by training separate translation tables, we eliminate the need for any re-processing of the baseline data. Experimental results are comparable, or even superior, to the baseline batch-mode training. Based on this success, we explore the possibility of sharpening the alignment model via an incremental training scheme. By first training a general word alignment model on the whole corpus, then dividing the same corpus into domain-specific partitions and applying incremental training to each partition, we can improve machine translation quality as measured by BLEU.
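The incremental idea can be sketched with IBM Model 1, the simplest EM-based word alignment model: translation probabilities estimated on the general corpus warm-start EM on a new partition, so the new data is aligned without re-running EM over the baseline corpus. This is an illustrative simplification, not the paper's actual pipeline (which uses richer alignment models); the function name and its parameters are hypothetical, and entries absent from the new corpus fall back to a uniform default rather than being carried over.

```python
from collections import defaultdict

def train_ibm1(bitext, iterations=5, init_t=None):
    """Estimate IBM Model 1 probabilities t(f|e) with EM.

    bitext: list of (source_tokens, target_tokens) sentence pairs.
    init_t: optional prior t-table from a previously trained model;
            if given, EM starts from it (warm start / incremental
            training) instead of a uniform distribution.
    """
    src_vocab = {w for src, _ in bitext for w in src}
    uniform = 1.0 / max(len(src_vocab), 1)
    t = defaultdict(lambda: uniform)
    if init_t:
        t.update(init_t)  # warm start from the general-corpus model

    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # marginals over target words e
        for src, tgt in bitext:
            for f in src:
                # E-step: posterior over candidate target words.
                z = sum(t[(f, e)] for e in tgt)
                for e in tgt:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[e] += p
        # M-step: re-estimate t(f|e) from expected counts.
        t = defaultdict(lambda: uniform,
                        {(f, e): c / total[e]
                         for (f, e), c in count.items()})
    return t

# Batch training on a (tiny, toy) general corpus...
general = [(["das", "haus"], ["the", "house"]),
           (["das", "buch"], ["the", "book"])]
t_general = train_ibm1(general, iterations=10)

# ...then incremental training on a new partition, warm-started
# from the general model instead of retraining from scratch.
new_partition = [(["das", "auto"], ["the", "car"])]
t_new = train_ibm1(new_partition, iterations=5, init_t=t_general)
```

The warm start only changes the E-step posteriors of the first iteration; a practical system would additionally interpolate old and new counts so probabilities for unseen pairs survive, which is one design choice the incremental setting forces.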