Multi-stream Confidence Analysis for Audio-Visual Affect Recognition [chapter]

Zhihong Zeng, Jilin Tu, Ming Liu, Thomas S. Huang
2005 Lecture Notes in Computer Science  
Changes in a speaker's emotion are a fundamental component in human communication. Some emotions motivate human actions while others add deeper meaning and richness to human interactions. In this paper, we explore the development of a computing algorithm that uses audio and visual sensors to recognize a speaker's affective state. Within the framework of Multistream Hidden Markov Model (MHMM), we analyze audio and visual observations to detect 11 cognitive/emotive states. We investigate the use of individual modality confidence measures as a means of estimating weights when combining likelihoods in the audio-visual decision fusion. Person-independent experimental results from 20 subjects in 660 sequences suggest that the use of stream exponents estimated on training data results in classification accuracy improvement of audio-visual affect recognition.
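The weighted combination of stream likelihoods mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the confidence measures are computed, so the sketch assumes the two stream exponents sum to 1 and substitutes a plain grid search over training accuracy for the confidence-based estimate. The function names and the toy numbers in the usage note are hypothetical.

```python
def fuse_log_likelihoods(audio_ll, visual_ll, audio_weight):
    """Pick the class with the highest weighted sum of log-likelihoods.

    audio_ll and visual_ll hold one log-likelihood per class, as each
    stream's model might produce. audio_weight is the audio stream
    exponent; the visual exponent is assumed to be 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    visual_weight = 1.0 - audio_weight
    scores = [
        audio_weight * a + visual_weight * v
        for a, v in zip(audio_ll, visual_ll)
    ]
    # Index of the best-scoring class (first one wins on ties).
    return max(range(len(scores)), key=scores.__getitem__)


def estimate_audio_weight(train_audio, train_visual, labels, steps=11):
    """Grid-search the audio exponent that maximizes training accuracy.

    This is a stand-in (an assumption) for the confidence-based weight
    estimation that the abstract refers to but does not specify.
    """
    best_weight, best_accuracy = 0.0, -1.0
    for i in range(steps):
        weight = i / (steps - 1)
        correct = sum(
            fuse_log_likelihoods(a, v, weight) == y
            for a, v, y in zip(train_audio, train_visual, labels)
        )
        accuracy = correct / len(labels)
        if accuracy > best_accuracy:
            best_weight, best_accuracy = weight, accuracy
    return best_weight, best_accuracy
```

For example, with hypothetical two-class scores, fuse_log_likelihoods([-1.0, -5.0], [-4.0, -1.0], 1.0) follows the audio stream and returns class 0, while an audio weight of 0.0 follows the visual stream and returns class 1.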
doi:10.1007/11573548_123 fatcat:se3zhvwbqrcbbega2qhudnk55q