Feature Optimization of Speech Emotion Recognition

Chunxia Yu, Ling Xie, Weiping Hu
2016 Journal of Biomedical Science and Engineering  
In this paper, speech emotion is divided into four categories: Fear, Happy, Neutral, and Surprise. Traditional features and their statistics are generally used to recognize speech emotion. To quantify each feature's contribution to emotion recognition, a method based on the Back Propagation (BP) neural network is adopted, from which the optimal subset of features is obtained. In addition, two new characteristics of speech emotion are added to the selected features: MFCC features extracted from the fundamental frequency curve (MFCCF0) and amplitude perturbation parameters extracted from the short-time average magnitude curve (APSAM). With the Gaussian Mixture Model (GMM), the highest average recognition rate over the four emotions is 82.25%, and the recognition rate for Neutral is 90%.
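
The GMM classification step mentioned in the abstract can be illustrated with a minimal sketch, assuming the common setup of one diagonal-covariance mixture per emotion and a maximum-likelihood decision over the frames of an utterance. The abstract does not give the model details, so the function names, the two-dimensional features, and every parameter in TOY_MODELS are hypothetical and are not taken from the paper.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a mixture given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(frames, models):
    """Return the label whose GMM gives the highest total log-likelihood."""
    scores = {
        label: sum(gmm_log_likelihood(x, gmm) for x in frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: made-up parameters, NOT values from the paper.
TOY_MODELS = {
    "Fear": [(0.5, [2.0, 2.0], [1.0, 1.0]), (0.5, [3.0, 1.0], [1.0, 1.0])],
    "Happy": [(1.0, [-2.0, 2.0], [1.0, 1.0])],
    "Neutral": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "Surprise": [(1.0, [2.0, -3.0], [1.0, 1.0])],
}
```

In practice the mixture parameters would be fitted to training features for each emotion (for example with expectation-maximization) rather than written by hand; the sketch only shows the decision rule.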
doi:10.4236/jbise.2016.910b005 fatcat:zstjq5uudffxnfssuotksgv4ty