Random discriminant structure analysis for automatic recognition of connected vowels

Yu Qiao, Satoshi Asakawa, Nobuaki Minematsu
2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)
The universal structure of speech [1, 2] proves to be invariant to transformations in feature space, and thus provides a robust representation for speech recognition. One difficulty in using the structural representation is its high dimensionality, which not only increases computational cost but also makes the method prone to the curse of dimensionality [3, 4]. In this paper, we introduce Random Discriminant Structure Analysis (RDSA) to deal with this problem. Based on the observation that structural features are highly correlated and contain considerable redundancy, RDSA combines random feature selection and discriminative analysis to compute several low-dimensional, discriminative representations from an input structure. An individual classifier is then trained for each representation, and the outputs of the classifiers are integrated to make the final classification decision. Experimental results on connected Japanese vowel utterances show that our approach achieves a recognition rate of 98.3% using training data from 8 speakers, which is higher than the 97.4% achieved by HMMs trained on the utterances of 4,130 speakers.
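The general pattern described in the abstract (draw several random feature subsets, train one classifier per subset, then combine the outputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the discriminative analysis step or the integration rule, so the sketch assumes a plain nearest-centroid classifier in place of discriminant analysis and a majority vote for integration. All function names, parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter


def train_centroids(X, y, idx):
    """Per-class mean over the selected feature indices.

    Assumption: nearest-centroid stands in for the discriminative
    analysis step, which the abstract does not detail.
    """
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(idx))
        for k, j in enumerate(idx):
            acc[k] += row[j]
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict_one(centroids, idx, row):
    """Label of the closest class centroid in the selected subspace."""
    sub = [row[j] for j in idx]
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(sub, centroids[c])),
    )


def fit_ensemble(X, y, n_members=5, subset_size=3, seed=0):
    """Train one classifier per randomly selected feature subset."""
    rng = random.Random(seed)
    dim = len(X[0])
    members = []
    for _ in range(n_members):
        idx = rng.sample(range(dim), subset_size)  # random feature selection
        members.append((idx, train_centroids(X, y, idx)))
    return members


def predict(members, row):
    """Integrate member outputs by majority vote (an assumed rule)."""
    votes = Counter(predict_one(cent, idx, row) for idx, cent in members)
    return votes.most_common(1)[0][0]


# Toy 6-dimensional data with two made-up classes, not from the paper.
X = [
    [0.0, 0.2, 0.1, 0.0, 0.3, 0.1],
    [0.1, 0.0, 0.2, 0.1, 0.0, 0.2],
    [5.0, 5.1, 4.9, 5.2, 5.0, 4.8],
    [4.9, 5.0, 5.1, 5.0, 5.2, 5.1],
]
y = ["a", "a", "e", "e"]

members = fit_ensemble(X, y)
print(predict(members, [0.1] * 6))
print(predict(members, [4.9] * 6))
```

Because each member sees only a small subset of the features, every classifier works in a low-dimensional space, while the vote across members recovers information spread over the full, redundant feature set.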
doi:10.1109/asru.2007.4430176 dblp:conf/asru/QiaoAM07 fatcat:y4a47jf6zvhaxl46ylrlk7bw24