Comparison of text-independent speaker recognition methods using VQ-distortion and discrete/continuous HMM's

T. Matsui, S. Furui
1994 IEEE Transactions on Speech and Audio Processing  
This paper compares a VQ (vector quantization) distortion-based speaker recognition method and discrete/continuous ergodic HMM (hidden Markov model)-based methods, especially from the viewpoint of robustness against utterance variations. We show that a continuous ergodic HMM is as robust as a VQ-distortion method when enough data is available and that a continuous ergodic HMM is far superior to a discrete ergodic HMM. We also show that the information on transitions between different states is ... ctive for text-independent speaker recognition. Therefore, the speaker identification rates using a continuous ergodic HMM are strongly correlated with the total number of mixtures irrespective of the number of states. It is also found that, for continuous ergodic HMM-based speaker recognition, the Distortion-Intersection Measure (DIM), which was introduced as a VQ-distortion measure to increase the robustness against utterance variations, is effective.
doi:10.1109/89.294363 fatcat:eaiv3gaslfea5lxz3w6tjjpqjq
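
For readers unfamiliar with the VQ-distortion approach named in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each speaker is represented by a small codebook of feature vectors and that a test utterance is scored by the average Euclidean distance from each frame to its nearest codeword; the speaker with the lowest average distortion is selected. The function names, the toy two-dimensional vectors, and the choice of Euclidean distance are all assumptions made for illustration. The DIM variant mentioned in the abstract is not shown.

```python
import math


def vq_distortion(frames, codebook):
    """Average Euclidean distance from each frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        # Quantize the frame to the closest codeword and accumulate the error.
        total += min(math.dist(frame, codeword) for codeword in codebook)
    return total / len(frames)


def identify_speaker(frames, codebooks):
    """Return the speaker whose codebook yields the smallest average distortion."""
    return min(codebooks, key=lambda name: vq_distortion(frames, codebooks[name]))


# Hypothetical toy data: real systems would use cepstral features and
# codebooks trained per speaker (for example with a clustering algorithm).
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
test_frames = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.1)]

best = identify_speaker(test_frames, codebooks)
print(best)
```

Because the score ignores the order of the frames, this kind of method is naturally text-independent: only how well the codebook covers the speaker's feature space matters.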