Field test evaluations and optimization of speaker independent speech recognition for telephone applications

Christian Gagnoulet, Christel Sorin
1991 Proceedings of the workshop on Speech and Natural Language - HLT '91   unpublished
This paper presents, in its first part, the detailed results of several field evaluations of the CNET speaker-independent speech recognition system in the context of two voice-activated servers accessible to the general French public over the telephone. The analysis of roughly 11 000 user tokens indicates that the rejection of incorrect input is a major problem and that the gap between the recognition rates observed in real use conditions and in the most "realistic" laboratory tests remains very large. The second part of the paper describes the current improvements to the system: better rejection procedures, enhanced recognition performance resulting from both the introduction of field data into the training data and an increase in the number of parameters, and automatic adjustment of the HMM topology, which makes it possible either to reduce overall model complexity or to improve recognition performance. Tested on long-distance telephone databases (450 to 750 speakers), the current version of the CNET recognition system yields a laboratory error rate of 0.7% on the 10 French digits and of 0.95% on a 36-word vocabulary.
doi:10.3115/112405.112429 fatcat:hsgb3e3zfjeazefp5l7wcbtara