Speech-based Emotion Recognition and Speaker Identification: Static vs. Dynamic Mode of Speech Representation

Maxim Sidorov, Wolfgang Minker, Eugene S. Semenkin
2016 Journal of Siberian Federal University Mathematics & Physics  
In this paper we present the performance of different machine learning algorithms on the problems of speech-based Emotion Recognition (ER) and Speaker Identification (SI) in static and dynamic modes of speech signal representation. The study takes a multi-corpus, multi-language approach: three databases for the SI task and four databases for the ER task, covering three languages (German, English and Japanese), were used to evaluate the models. More than 45 machine learning algorithms were applied to these tasks in both modes, and the results are presented here together with a discussion.
Keywords: emotion recognition from speech, speaker identification from speech, machine learning algorithms, speaker adaptive emotion recognition from speech.
doi:10.17516/1997-1397-2016-9-4-518-523 fatcat:mj7jny64v5fzjcsvl7ec3dhn3m