A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2010; you can also visit the original URL.
The file type is application/pdf
.
Improved word alignment with statistics and linguistic heuristics
2009
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing Volume 1 - EMNLP '09
unpublished
We present a method to align words in a bitext that combines elements of a traditional statistical approach with linguistic knowledge. We demonstrate this approach for Arabic-English, using an alignment lexicon produced by a statistical word aligner, as well as linguistic resources ranging from an English parser to heuristic alignment rules for function words. These linguistic heuristics have been generalized from a development corpus of 100 parallel sentences. Our aligner, UALIGN, outperforms
doi:10.3115/1699510.1699540
fatcat:ngocj46bdzbhzdhf2gb6jhmgbm