Unsupervised Vocabulary Adaptation for Morph-based Language Models
2012
North American Chapter of the Association for Computational Linguistics
Modeling of foreign entity names is an important unsolved problem in morpheme-based modeling that is common in morphologically rich languages. In this paper we present an unsupervised vocabulary adaptation method for morph-based speech recognition. Foreign word candidates are detected automatically from in-domain text through the use of letter n-gram perplexity. Over-segmented foreign entity names are restored to their base forms in the morph-segmented in-domain text for easier and more
dblp:conf/naacl/MansikkaniemiK12
fatcat:7yeb5bpttzfk3dfxn6gywodqxi
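
As a rough illustration of the detection step mentioned in the abstract (flagging foreign word candidates by letter n-gram perplexity), the following is a minimal sketch and not the authors' implementation. The model order (trigram), the add-one smoothing, the toy word list, and the threshold value are all assumptions made for illustration; this record does not state what the paper actually used. The names `LetterNgramModel` and `foreign_candidates` are hypothetical.

```python
import math
from collections import Counter


class LetterNgramModel:
    """Letter n-gram model with add-one smoothing (an assumed choice)."""

    def __init__(self, words, n=3):
        self.n = n
        self.ngrams = Counter()    # counts of (context + next letter)
        self.contexts = Counter()  # counts of contexts alone
        self.alphabet = {"$"}      # "$" marks the end of a word
        for word in words:
            self.alphabet.update(word)
            padded = "^" * (n - 1) + word + "$"  # "^" pads the word start
            for i in range(n - 1, len(padded)):
                context = padded[i - n + 1:i]
                self.ngrams[context + padded[i]] += 1
                self.contexts[context] += 1

    def perplexity(self, word):
        """Per-letter perplexity of `word` under the model."""
        n = self.n
        padded = "^" * (n - 1) + word + "$"
        vocab = len(self.alphabet) + 1  # one extra slot for unseen letters
        log_prob = 0.0
        steps = 0
        for i in range(n - 1, len(padded)):
            context = padded[i - n + 1:i]
            prob = (self.ngrams[context + padded[i]] + 1) / (
                self.contexts[context] + vocab
            )
            log_prob += math.log(prob)
            steps += 1
        return math.exp(-log_prob / steps)


def foreign_candidates(model, words, threshold):
    """Return the words whose letter perplexity exceeds `threshold`."""
    return [w for w in words if model.perplexity(w) > threshold]


# Toy "native" training words (Finnish-like, chosen only for illustration).
native_words = ["talo", "talossa", "kala", "kalassa",
                "katu", "kadulla", "sana", "sanassa"]
model = LetterNgramModel(native_words, n=3)

# The threshold is an arbitrary assumption; in practice it would need tuning.
candidates = foreign_candidates(model, ["talossa", "xqzwj"], threshold=8.0)
```

Words whose letter sequences are unlikely under a model trained on native text receive a high perplexity and are flagged. A real system would train on a much larger corpus and would likely use a stronger smoothing method than add-one.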