Improving statistical machine translation in the medical domain using the unified medical language system

Matthias Eck, Stephan Vogel, Alex Waibel
2004 Proceedings of the 20th International Conference on Computational Linguistics - COLING '04 unpublished
Processing texts from the medical domain is an important task in natural language processing. This paper investigates the usefulness of a large medical database (the Unified Medical Language System) for translating dialogues between doctors and patients with a statistical machine translation system. We show that extracting a large dictionary and using semantic type information to generalize the training data significantly improve translation performance.
doi:10.3115/1220355.1220469 fatcat:cn4zruk3hzgijovssm7svpno3a