A dynamic in-search data selection method with its applications to acoustic modeling and utterance verification
IEEE Transactions on Speech and Audio Processing
In this paper, we propose a dynamic in-search data selection method to diagnose competing information automatically from speech data. In our method, the Viterbi beam search is used to decode all training data. During decoding, all partial paths within the beam are examined to identify the so-called competing-token and true-token sets for each individual hidden Markov model (HMM). In this work, the collected data tokens are used for acoustic modeling and utterance verification as two specific
examples. In acoustic modeling, the true-token sets are used to adapt the HMMs with a sequential maximum a posteriori adaptation method, while a generalized probabilistic descent-based discriminative training method is proposed to improve the HMMs using the competing-token sets. In utterance verification, under the framework of likelihood ratio testing, the true-token sets are used to train positive models for the null hypothesis, and the competing-token sets are used to estimate negative models for the alternative hypothesis. All the proposed methods are evaluated on the Bell Laboratories Communicator system. Experimental results show that the new acoustic modeling method consistently improves recognition performance over our best maximum likelihood estimation models, with roughly a 1% absolute reduction in word error rate. The results also show that the new verification models significantly outperform conventional anti-models in utterance verification, with a relative reduction of almost 30% in equal error rate when identifying misrecognized words in the recognition results.

Index Terms: Competing token, discriminative training, in-search data selection, log likelihood ratio (LLR) testing, sequential maximum a posteriori (MAP) adaptation, true token (TT).
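The likelihood ratio test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the positive and negative models here are single diagonal-covariance Gaussians standing in for the HMM-based verification models, the toy feature vectors are made up, and the names `fit_diag_gaussian`, `llr_score`, `verify`, and the `threshold` parameter are hypothetical. The per-frame averaging of the log likelihood ratio is also an assumption, chosen so that scores are comparable across segments of different lengths.

```python
import math


def diag_gaussian_loglik(x, mean, var):
    """Log density of one feature vector under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def fit_diag_gaussian(tokens, var_floor=1e-3):
    """Maximum likelihood mean and variance per dimension (assumed stand-in model).

    `tokens` is a list of equal-length feature vectors. Variances are floored
    to keep the log density finite on tiny data sets.
    """
    n = len(tokens)
    dim = len(tokens[0])
    mean = [sum(t[d] for t in tokens) / n for d in range(dim)]
    var = [
        max(sum((t[d] - mean[d]) ** 2 for t in tokens) / n, var_floor)
        for d in range(dim)
    ]
    return mean, var


def llr_score(frames, pos_model, neg_model):
    """Average per-frame log likelihood ratio: log p(x | pos) - log p(x | neg)."""
    total = 0.0
    for x in frames:
        total += diag_gaussian_loglik(x, *pos_model)
        total -= diag_gaussian_loglik(x, *neg_model)
    return total / len(frames)


def verify(frames, pos_model, neg_model, threshold=0.0):
    """Accept the null hypothesis (correctly recognized) when the LLR clears the threshold."""
    return llr_score(frames, pos_model, neg_model) >= threshold


# Toy data (invented for illustration): true tokens train the positive model,
# competing tokens train the negative model.
true_tokens = [[0.9, 1.1], [1.0, 0.8], [1.2, 1.0], [0.8, 1.1]]
competing_tokens = [[-1.0, -0.9], [-1.2, -1.1], [-0.8, -1.0], [-1.0, -1.2]]

pos_model = fit_diag_gaussian(true_tokens)
neg_model = fit_diag_gaussian(competing_tokens)

accepted = verify([[1.0, 1.0], [0.9, 1.0]], pos_model, neg_model)
rejected = verify([[-1.0, -1.0], [-1.1, -0.9]], pos_model, neg_model)
```

In practice the threshold would be tuned on held-out data, for example at the operating point where the false acceptance and false rejection rates are equal, which is the equal error rate reported above.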