Confidence scoring based on backward language models

Duchateau, Demuynck, Wambacq
2002 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
In this paper we introduce the backward N-gram language model (LM) scores as a confidence measure in large vocabulary continuous speech recognition. Contrary to a forward N-gram LM, in which the probability of a word depends on the preceding words, a word in a backward N-gram LM is predicted based only on the following words. The backward LM is thus a model for sentences read from the end to the beginning. We show on the benchmark 20k-word Wall Street Journal recognition task that the backward LM scores contain information for the confidence measure that is complementary to the information in forward LM scores. The normalised cross entropy metric for confidence measures increases significantly from 18.5% to 23.3% when backward LM scores are added to a confidence measure which includes the forward LM scores.
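The core idea is that a backward N-gram LM is simply a standard N-gram model trained and evaluated on reversed word order, so each word is scored from its right context. Below is a minimal Python sketch of this, assuming a toy corpus, a bigram order, and add-alpha smoothing purely for illustration; it is not the authors' system, which used the 20k-word WSJ task and combined the per-word backward LM scores with forward LM scores in a confidence measure.

```python
from collections import defaultdict

def train_backward_bigram(sentences, alpha=1.0):
    """Train a backward bigram LM: P(w_i | w_{i+1}), i.e. each word is
    predicted from the word that follows it. This is equivalent to training
    a forward bigram on reversed sentences. Add-alpha smoothing is an
    illustrative assumption, not the paper's choice."""
    bigram = defaultdict(lambda: defaultdict(float))
    context_totals = defaultdict(float)
    vocab = set()
    for sent in sentences:
        # Reverse the sentence; the sentence-end marker becomes the first context.
        rev = ["</s>"] + sent[::-1] + ["<s>"]
        for ctx, w in zip(rev, rev[1:]):
            bigram[ctx][w] += 1.0
            context_totals[ctx] += 1.0
            vocab.update((ctx, w))
    V = len(vocab)

    def prob(word, next_word):
        # P(word | next_word) with add-alpha smoothing over the vocabulary.
        return (bigram[next_word][word] + alpha) / (context_totals[next_word] + alpha * V)

    return prob

# Toy usage: score each word of a recognition hypothesis with the backward LM.
corpus = [["stocks", "rose", "sharply"], ["stocks", "fell", "sharply"]]
backward_prob = train_backward_bigram(corpus)
hyp = ["stocks", "rose", "sharply"]
padded = hyp + ["</s>"]
for i, w in enumerate(hyp):
    # Per-word backward LM probability: a candidate confidence feature that
    # complements the forward LM score for the same word.
    print(w, backward_prob(w, padded[i + 1]))
```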
doi:10.1109/icassp.2002.1005716