Experiments in adaptation of language models for commercial applications

Petra Witschel, Harald Höge
1997 5th European Conference on Speech Communication and Technology (Eurospeech 1997)   unpublished
To improve recognition accuracy for large-vocabulary speech recognition systems, we use language models based on linguistic classes (extended POS). This paper presents an adaptation technique that exploits linguistic knowledge about the unknown words of a new domain. When switching from the base domain to the new domain, we keep the bigram probabilities of the linguistic classes fixed and adapt only the monogram word probabilities. Our experiments use three corpora: the financial columns of a newspaper corpus and two medical corpora (computer tomography and magnetic resonance). The adapted language models improve test-set perplexity by 48% to 51% compared with placing unknown words in the language model's "unknown" class.
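The adaptation idea in the abstract can be illustrated with a minimal sketch. It assumes a class-bigram model of the form p(w_i | w_{i-1}) = p(c_i | c_{i-1}) * p(w_i | c_i), that each word belongs to exactly one class, and that word probabilities are re-estimated by relative frequency. None of these details is confirmed by the abstract. The function names, class labels, probabilities, and toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def estimate_word_given_class(tagged_words):
    """Relative-frequency estimate of p(word | class) from (word, class) pairs."""
    class_totals = Counter(cls for _, cls in tagged_words)
    pair_counts = Counter(tagged_words)
    return {(w, c): n / class_totals[c] for (w, c), n in pair_counts.items()}


def perplexity(sentence, class_bigram, word_given_class, start="<s>"):
    """Perplexity of a (word, class) sequence under the class-bigram model."""
    log_prob = 0.0
    prev = start
    for word, cls in sentence:
        # p(w_i | w_{i-1}) = p(c_i | c_{i-1}) * p(w_i | c_i)
        p = class_bigram[(prev, cls)] * word_given_class[(word, cls)]
        log_prob += math.log2(p)
        prev = cls
    return 2 ** (-log_prob / len(sentence))


# Class bigrams p(c_i | c_{i-1}): assumed to come from the base domain
# and kept fixed during adaptation (toy values, not from the paper).
class_bigram = {
    ("<s>", "NOUN"): 0.6, ("<s>", "VERB"): 0.4,
    ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.7,
    ("VERB", "NOUN"): 0.8, ("VERB", "VERB"): 0.2,
}

# New-domain text whose words already carry a linguistic class
# (hypothetical medical-style example).
new_domain = [("scan", "NOUN"), ("shows", "VERB"),
              ("lesion", "NOUN"), ("scan", "NOUN")]

# Adaptation step: only the word-level probabilities are re-estimated.
adapted = estimate_word_given_class(new_domain)

test_sentence = [("scan", "NOUN"), ("shows", "VERB"), ("lesion", "NOUN")]
pp = perplexity(test_sentence, class_bigram, adapted)
print(f"test-set perplexity: {pp:.3f}")
```

In this sketch the new-domain words receive probability mass inside their own linguistic class rather than being collapsed into a single "unknown" class, which is the contrast the abstract draws. How the paper assigns classes to unknown words is not stated in the abstract.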
doi:10.21437/eurospeech.1997-522 fatcat:52intglguzgg7nawkclryxfava