Source-side Preordering for Translation using Logistic Regression and Depth-first Branch-and-Bound Search
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
We present a simple preordering approach for machine translation based on a featurerich logistic regression model to predict whether two children of the same node in the source-side parse tree should be swapped or not. Given the pair-wise children regression scores we conduct an efficient depth-first branch-and-bound search through the space of possible children permutations, avoiding using a cascade of classifiers or limiting the list of possible ordering outcomes. We report experiments in
... experiments in translating English to Japanese and Korean, demonstrating superior performance as (a) the number of crossing links drops by more than 10% absolute with respect to other state-of-the-art preordering approaches, (b) BLEU scores improve on 2.2 points over the baseline with lexicalised reordering model, and (c) decoding can be carried out 80 times faster. * This work was done during an internship of the first author at SDL Research, Cambridge.