Multi-Task Minimum Error Rate Training for SMT

Patrick Simianer, Katharina Wäschle, Stefan Riezler
2011 Prague Bulletin of Mathematical Linguistics  
We present experiments on multi-task learning for discriminative training in statistical machine translation (SMT), extending standard minimum-error-rate training (MERT) with techniques that take advantage of the similarity of related tasks. We apply our techniques to German-to-English translation of patents from 8 tasks defined by the International Patent Classification (IPC) system. Our experiments show statistically significant gains over task-specific training by techniques that model commonalities through shared parameters. However, more fine-grained combinations of shared parameters with task-specific ones could not be brought to bear on models with a small number of dense features. The software used in the experiments is released as an open-source tool.
doi:10.2478/v10108-011-0015-0 fatcat:eepkepl6lnhkld2bizwytesrn4