Bilingual terminology acquisition from comparable corpora and phrasal translation to cross-language information retrieval

Fatiha Sadat, Masatoshi Yoshikawa, Shunsuke Uemura
2003 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - ACL '03  
This paper presents an approach to bilingual lexicon extraction from non-aligned comparable corpora and to phrasal translation, with evaluations on Cross-Language Information Retrieval. A two-stage translation model is proposed for acquiring bilingual terminology from comparable corpora, then disambiguating and selecting the best translation alternatives according to linguistics-based knowledge. Different rescoring techniques are proposed and evaluated in order to select the best phrasal translation alternatives. Results demonstrate that the proposed translation model yields better translations and that retrieval effectiveness could be achieved across the Japanese-English language pair.
doi:10.3115/1075178.1075201 dblp:conf/acl/SadatYU03 fatcat:3ffo3vcownhq7kb2qw3x22e5r4