A fast maximum likelihood nonlinear feature transformation method for GMM–HMM speaker adaptation

Kaisheng Yao, Dong Yu, Li Deng, Yifan Gong
2014 Neurocomputing  
We describe a novel maximum likelihood nonlinear feature bias compensation method for Gaussian mixture model–hidden Markov model (GMM-HMM) adaptation. Our approach exploits a single-hidden-layer neural network (SHLNN) that, similar to the extreme learning machine (ELM), uses randomly generated lower-layer weights and linear output units. Different from the conventional ELM, however, our approach optimizes the SHLNN parameters by maximizing the likelihood of observing the features given the speaker-independent GMM-HMM. We derive a novel and efficient learning algorithm for optimizing this criterion. We show, on a large-vocabulary speech recognition task, that the proposed approach can cut the word error rate (WER) by 13% over the feature maximum likelihood linear regression (fMLLR) method with bias compensation, and can cut the WER by more than 5% over the fMLLR method with both bias and rotation transformations if applied on top of fMLLR. Overall, it can reduce the WER by more than 27% over the speaker-independent system with 0.2 real-time adaptation time. Please cite this article as: K. Yao, et al., A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation, Neurocomputing (2013), http://dx.doi.org/10.1016/j.neucom.2013.02.050
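The core idea of the abstract — an ELM-like network with fixed random lower-layer weights whose linear output layer produces a feature bias, trained to maximize likelihood under the speaker-independent model — can be sketched as below. This is only an illustrative toy, not the paper's algorithm: the dimensions, the sigmoid activation, the single diagonal Gaussian standing in for the GMM-HMM, and the plain gradient-ascent update are all assumptions made for the sketch (the paper derives a more efficient learning rule).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper).
D = 13   # acoustic feature dimension
H = 50   # number of hidden units
T = 200  # number of adaptation frames

# Random lower-layer weights, fixed after generation, as in an ELM.
W = rng.normal(scale=0.1, size=(H, D))
b = rng.normal(scale=0.1, size=H)

# Linear output weights: the only parameters adapted per speaker.
U = np.zeros((D, H))

def shlnn_bias(x, U):
    """Nonlinear bias term: U @ sigmoid(W x + b)."""
    h = 1.0 / (1.0 + np.exp(-(W @ x + b)))
    return U @ h, h

# Toy speaker-independent model: a single diagonal Gaussian
# standing in for the GMM-HMM (an assumption for this sketch).
mu = np.zeros(D)
var = np.ones(D)

def log_lik(X, U):
    """Mean log-likelihood of the bias-compensated features."""
    total = 0.0
    for x in X:
        d, _ = shlnn_bias(x, U)
        y = x + d  # compensated feature
        total += -0.5 * np.sum((y - mu) ** 2 / var)
    return total / len(X)

# Synthetic "speaker" data: shifted relative to the model.
X = rng.normal(loc=0.5, size=(T, D))

# Plain gradient ascent on the ML criterion; a stand-in for the
# efficient learning algorithm derived in the paper.
lr = 0.05
for _ in range(100):
    grad = np.zeros_like(U)
    for x in X:
        d, h = shlnn_bias(x, U)
        y = x + d
        # d(log-lik)/dU for one frame: outer product of the
        # Gaussian score -(y - mu)/var with the hidden activations.
        grad += np.outer(-(y - mu) / var, h)
    U += lr * grad / len(X)

print("before:", log_lik(X, np.zeros_like(U)))
print("after: ", log_lik(X, U))
```

Because the lower-layer weights stay fixed, only the linear output weights `U` are estimated from the adaptation data, which is what keeps this family of transforms fast enough for the sub-real-time adaptation the abstract reports.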