A Robust Viterbi Algorithm Against Impulsive Noise With Application to Speech Recognition

M. Siu, A. Chan
2006 IEEE Transactions on Audio, Speech, and Language Processing  
... the "best" state sequence that is insensitive to a limited number of corruptions by focusing on finding the best path excluding k worse-performing observations. This is similar to trimmed means [16] or robust regression [17] in statistics in that the selected best path is insensitive to up to k outliers. The labeling of the k worse-performing observations is path dependent. The advantage of the proposed joint approach is that the state-dependent likelihoods, used in the process of finding the best path, are also used for identifying the corruptions. This is particularly useful for identifying impulsive noise that can be confused with some speech units. For example, if a noise-like frame is in the middle of a vowel segment, the match between this frame and the vowel model would be poor, and this information can help label the noise-like frame as a corrupted observation. A frame-by-frame noise detector may have difficulty accurately identifying corrupted observations that resemble sounds that also occur in speech. Furthermore, the proposed approach explicitly controls the number of observations to drop, thus preventing the extreme case in which all of the observations are marked as noise-corrupted. Compared with the probabilistic union model [15], the proposed approach has the advantage of not degrading performance on clean test data. Furthermore, it can be formulated within the Viterbi algorithm such that a full Viterbi search is possible.
October 10, 2005 DRAFT
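The idea of searching for the best path while excluding up to k observations can be illustrated with a small sketch. This is not the authors' formulation, which the excerpt above does not give. It assumes that a dropped frame contributes a fixed score `drop_logp` in place of its emission log-likelihood, and that the search state is extended with a counter of frames dropped so far. The function name `robust_viterbi` and all parameter names are hypothetical.

```python
import math

NEG_INF = float("-inf")

def robust_viterbi(log_init, log_trans, log_emit, k, drop_logp=0.0):
    """Viterbi search that may drop up to k frames (illustrative sketch).

    log_init[s]     : log initial probability of state s
    log_trans[p][s] : log transition probability from state p to state s
    log_emit[t][s]  : log likelihood of frame t under state s
    drop_logp       : assumed score used for a dropped frame
    Returns (score, state_path, dropped_frame_indices).
    """
    T, S = len(log_emit), len(log_init)
    # delta[j][s]: best score ending in state s with j frames dropped so far
    delta = [[NEG_INF] * S for _ in range(k + 1)]
    bp0 = [[None] * S for _ in range(k + 1)]
    for s in range(S):
        delta[0][s] = log_init[s] + log_emit[0][s]
        bp0[0][s] = (-1, 0, False)
        if k >= 1:
            delta[1][s] = log_init[s] + drop_logp
            bp0[1][s] = (-1, 0, True)
    back = [bp0]  # back[t][j][s] = (previous state, previous j, dropped?)

    for t in range(1, T):
        new = [[NEG_INF] * S for _ in range(k + 1)]
        bp = [[None] * S for _ in range(k + 1)]
        for j in range(k + 1):
            for s in range(S):
                for dropped in (False, True):
                    pj = j - 1 if dropped else j
                    if pj < 0:
                        continue
                    obs = drop_logp if dropped else log_emit[t][s]
                    for p in range(S):
                        cand = delta[pj][p] + log_trans[p][s] + obs
                        if cand > new[j][s]:
                            new[j][s] = cand
                            bp[j][s] = (p, pj, dropped)
        delta = new
        back.append(bp)

    # Pick the best end point over all states and all drop counts up to k.
    score, j, s = max(
        (delta[j][s], j, s) for j in range(k + 1) for s in range(S)
    )
    states, dropped_frames = [], []
    for t in range(T - 1, -1, -1):
        states.append(s)
        p, pj, dropped = back[t][j][s]
        if dropped:
            dropped_frames.append(t)
        s, j = p, pj
    states.reverse()
    dropped_frames.reverse()
    return score, states, dropped_frames
```

With `k=0` the sketch reduces to the standard Viterbi recursion. Because the decision to drop a frame is made inside the path search, which frames are dropped depends on the path, and the cap `k` bounds how many frames can be treated as corrupted. The choice of `drop_logp` matters: with discrete emission probabilities and `drop_logp=0.0`, dropping a frame never lowers the score, so the best path generally uses all k drops.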
doi:10.1109/tasl.2006.872592 fatcat:rvkfupsifjepdcgfui4jovo2ju