Bayesian Feature Enhancement for Reverberation and Noise Robust Speech Recognition

Volker Leutnant, Alexander Krueger, Reinhold Haeb-Umbach
2013 IEEE Transactions on Audio, Speech, and Language Processing  
In this contribution we extend a previously proposed Bayesian approach for the enhancement of reverberant logarithmic mel power spectral coefficients for robust automatic speech recognition to the additional compensation of background noise. A recently proposed observation model is employed whose time-variant observation error statistics are obtained as a by-product of the inference of the a posteriori probability density function of the clean speech feature vectors. Further, a reduction of the computational effort and the memory requirements is achieved by using a recursive formulation of the observation model. The performance of the proposed algorithms is first studied experimentally on a connected digits recognition task with artificially created noisy reverberant data. It is shown that the use of the time-variant observation error model leads to a significant error rate reduction at low signal-to-noise ratios compared to a time-invariant model. Further experiments were conducted on a 5000-word task recorded in a reverberant and noisy environment. A significant word error rate reduction was obtained, demonstrating the effectiveness of the approach on real-world data.
Index Terms: Robust automatic speech recognition, model-based Bayesian feature enhancement, observation model for reverberant and noisy speech, recursive observation model.
doi:10.1109/tasl.2013.2258013 fatcat:it3gguw7ibdlnbgedk4upoffcm