Rescoring effectiveness of language models using different levels of knowledge and their integration

Wen Wang, Yang Liu, Mary P. Harper
2002 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
In this paper, we compare the efficacy of a variety of language models (LMs) for rescoring word graphs and N-best lists generated by a large vocabulary continuous speech recognizer. These LMs differ based on the level of knowledge used (word, lexical features, syntax) and the type of integration of that knowledge (tight or loose). The trigram LM incorporates word-level information; our Part-of-Speech (POS) LM uses word and lexical class information in a tightly coupled way; our new SuperARV LM tightly integrates word information, a richer set of lexical features than POS, and syntactic dependency information; and the Parser LM integrates some limited word information, POS, and syntactic information. We also investigate LMs created using a linear interpolation of LM pairs. When comparing each LM on the task of rescoring word graphs or N-best lists for the Wall Street Journal (WSJ) 5k- and 20k-vocabulary test sets, the SuperARV LM always achieves the greatest reduction in word error rate (WER) and the greatest increase in sentence accuracy (SAC). On the 5k test sets, the SuperARV LM obtains more than a 10% relative reduction in WER compared to the trigram LM, and on the 20k test set more than 2%. Additionally, the SuperARV LM performs comparably to or better than the interpolated LMs. Hence, we conclude that the tight coupling of knowledge from all three levels is an effective method of constructing high-quality LMs.
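The abstract mentions LMs built by linearly interpolating pairs of models. As a minimal sketch of that idea (the function name and the use of natural-log probabilities are illustrative choices, not the paper's), interpolation combines the two models' conditional word probabilities with a weight lambda, typically tuned on held-out data:

```python
import math

def interpolate_logprob(logp_a: float, logp_b: float, lam: float) -> float:
    """Linear interpolation of two language models:
        P(w | h) = lam * P_A(w | h) + (1 - lam) * P_B(w | h)
    Inputs and output are natural-log probabilities."""
    return math.log(lam * math.exp(logp_a) + (1.0 - lam) * math.exp(logp_b))

# Example: combine a trigram estimate with a SuperARV-style estimate.
combined = interpolate_logprob(math.log(0.02), math.log(0.05), lam=0.5)
```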
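N-best rescoring itself amounts to replacing the first-pass LM scores with scores from a stronger LM and re-ranking the hypotheses. The following is a hedged sketch under common assumptions (log-domain scores, an LM scale factor, and a per-word insertion penalty); the names and default weights are hypothetical, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    words: list            # hypothesized word sequence
    acoustic_score: float  # log acoustic likelihood from the first pass
    lm_score: float        # log probability under the rescoring LM

def rescore_nbest(hyps, lm_weight=15.0, word_penalty=-0.5):
    """Re-rank an N-best list: combine the acoustic score with the
    scaled LM score and a per-word insertion penalty, then return the
    highest-scoring hypothesis. Weights are tuned on development data."""
    def total(h):
        return h.acoustic_score + lm_weight * h.lm_score + word_penalty * len(h.words)
    return max(hyps, key=total)
```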
doi:10.1109/icassp.2002.1005857