Reconstruction-error-based learning for continuous emotion recognition in speech

Jing Han, Zixing Zhang, Fabien Ringeval, Björn Schuller
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ABSTRACT To advance the performance of continuous emotion recognition from speech, we introduce a reconstruction-error-based (RE-based) learning framework with memory-enhanced Recurrent Neural Networks (RNNs). The framework adopts two successive RNN models: the first is used as an autoencoder to reconstruct the original features, and the second performs emotion prediction. The RE of the original features is used as a complementary descriptor, which is merged with the original features and fed to the second model. The framework assumes that the system can learn its own 'drawback', which is expressed by the RE. Experimental results on the RECOLA database show that the proposed framework significantly outperforms the baseline systems without any RE information in terms of Concordance Correlation Coefficient (.729 vs .710 for arousal, .360 vs .237 for valence), and also significantly outperforms other state-of-the-art methods.
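The data flow described in the abstract (reconstruct the features, compute the RE, merge it with the original features for the second model) and the evaluation metric can be illustrated with a minimal sketch. This is not the authors' implementation: the moving-average function is a hypothetical stand-in for the first-stage RNN autoencoder, the choice of a per-dimension absolute difference as the RE is an assumption, and all function names are illustrative. Only the Concordance Correlation Coefficient follows its standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²).

```python
from typing import Callable, List, Sequence

Frames = List[List[float]]  # one feature vector per time frame


def moving_average_reconstruct(frames: Frames) -> Frames:
    """Hypothetical stand-in for the first-stage autoencoder (NOT an RNN):
    each frame is 'reconstructed' as the mean of itself and its neighbours."""
    out = []
    for t in range(len(frames)):
        window = frames[max(0, t - 1): t + 2]
        dim = len(frames[t])
        out.append([sum(f[d] for f in window) / len(window) for d in range(dim)])
    return out


def reconstruction_error(frames: Frames,
                         reconstruct: Callable[[Frames], Frames]) -> Frames:
    """Assumed RE definition: absolute difference per frame and per dimension."""
    recon = reconstruct(frames)
    return [[abs(x - r) for x, r in zip(f, g)] for f, g in zip(frames, recon)]


def augment_with_re(frames: Frames,
                    reconstruct: Callable[[Frames], Frames]) -> Frames:
    """Merge the original features with their RE (simple concatenation).
    The result would be the input to the second, emotion-predicting model."""
    re = reconstruction_error(frames, reconstruct)
    return [f + e for f, e in zip(frames, re)]


def ccc(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Concordance Correlation Coefficient between predictions and gold labels."""
    n = len(gold)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    # Guard: two identical constant sequences give a zero denominator.
    return 2 * cov / denom if denom else 0.0


# Usage: three frames of 2-dimensional features become 4-dimensional
# (2 original values + 2 RE values) after augmentation.
features = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
augmented = augment_with_re(features, moving_average_reconstruct)
```

Unlike Pearson correlation, CCC also penalises a shift in mean or scale between prediction and gold standard, so a prediction that tracks the shape of the gold curve but is offset from it scores below 1.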
doi:10.1109/icassp.2017.7952580 dblp:conf/icassp/HanZRS17 fatcat:gj7sze6b5re2rkahvttklcw4am