Discrete mixture HMM

S. Takahashi, K. Aikawa, S. Sagayama
1997 IEEE International Conference on Acoustics, Speech, and Signal Processing  
This paper proposes a new type of acoustic model called the discrete mixture HMM (DMHMM). As large-scale speech databases have been constructed for speaker-independent HMMs, continuous mixture HMMs (CMHMMs) need to increase the number of mixture components in order to represent complex distributions. This leads to a high computational cost for calculating output probabilities. The DMHMM represents the feature parameter space by using mixtures of multivariate distributions in the same way as the diagonal covariance CMHMM. Instead of using Gaussian mixtures to represent feature distributions in each dimension, the DMHMM uses mixtures of discrete distributions based on scalar quantization (SQ). Since the discrete distribution has a higher degree of freedom in terms of representation, the DMHMM is advantageous in representing the feature distributions efficiently with fewer mixture components. In isolated word recognition experiments for telephone speech, we have found that the DMHMM outperformed the CMHMMs when the models had the same number of mixture components.
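The output probability described in the abstract can be illustrated with a minimal sketch. It assumes that each mixture component is a product of per-dimension discrete distributions indexed by a scalar-quantization code, by analogy with the diagonal covariance CMHMM, so that b(o) = sum_m w_m * prod_d P_{m,d}(q_d(o_d)). The function names, the quantizer boundaries, and all probability values below are hypothetical and are not taken from the paper; quantizer design and parameter training are not shown.

```python
import bisect

def quantize(x, edges):
    """Return the index of the scalar-quantization cell that contains x.

    `edges` is a sorted list of cell boundaries, giving len(edges) + 1 cells.
    (Hypothetical quantizer; the paper's SQ design is not reproduced here.)
    """
    return bisect.bisect_right(edges, x)

def dmhmm_output_prob(obs, edges, weights, tables):
    """Assumed DMHMM-style state output probability.

    obs     : feature vector, one value per dimension
    edges   : per-dimension quantizer boundaries
    weights : mixture weights w_m (should sum to 1)
    tables  : tables[m][d][k] = P(code k | mixture m, dimension d)
    """
    # Quantize each dimension independently (scalar quantization).
    codes = [quantize(x, e) for x, e in zip(obs, edges)]
    total = 0.0
    for w, table in zip(weights, tables):
        p = w
        # A table lookup per dimension replaces a Gaussian evaluation.
        for d, code in enumerate(codes):
            p *= table[d][code]
        total += p
    return total

# Toy example: 2 dimensions, 2 mixture components (illustrative values only).
edges = [[0.0], [-1.0, 1.0]]           # 2 cells in dim 0, 3 cells in dim 1
weights = [0.6, 0.4]
tables = [
    [[0.7, 0.3], [0.2, 0.5, 0.3]],     # mixture 0
    [[0.1, 0.9], [0.6, 0.3, 0.1]],     # mixture 1
]
prob = dmhmm_output_prob([0.5, 0.0], edges, weights, tables)
print(prob)
```

In this sketch the per-dimension cost is a single table lookup and one multiplication, which reflects the computational motivation given in the abstract; a practical system would typically work in the log domain to avoid underflow.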
doi:10.1109/icassp.1997.596100 dblp:conf/icassp/TakahashiAS97 fatcat:bzlhg6cdfbf5rc4bo5ckq7j2mm