A multimodal approach of generating 3D human-like talking agent

Minghao Yang, Jianhua Tao, Kaihui Mu, Ya Li, Jianfeng Che
2011 Journal on Multimodal User Interfaces  
This paper introduces a multimodal framework for generating a 3D human-like talking agent that can communicate with users through speech, lip movement, head motion, facial expression, and body animation. In this framework, lip movements are obtained by searching and matching acoustic features, represented as Mel-frequency cepstral coefficients (MFCCs), in an audio-visual bimodal database. Head motion is synthesized by visual prosody, which maps textual prosodic features into rotational and translational parameters. Facial expression and body animation are generated by transferring motion data to the skeleton. A simplified high-level Multimodal Marker Language (MML), in which only a few fields are used to coordinate the agent's channels, is introduced to drive the agent. Experiments validate the effectiveness of the proposed multimodal framework.
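As a rough illustration of the lip-synthesis step, the sketch below performs the kind of nearest-neighbour search the abstract describes: given an MFCC window extracted from the input speech, it returns the lip-motion clip whose stored MFCC window is closest in the audio-visual database. The function name, the 13-coefficient frame layout, and the database structure are assumptions for the sketch, not the paper's implementation.

```python
import numpy as np

# Hypothetical bimodal database: each entry pairs an MFCC window
# (frames x 13 coefficients) with the lip-motion clip recorded
# alongside it. Names and shapes are illustrative only.

def match_lip_clip(query_mfcc, database):
    """Return the lip-motion clip whose stored MFCC window is closest
    (Euclidean distance over all frames) to the query window."""
    best_clip, best_dist = None, np.inf
    for entry_mfcc, lip_clip in database:
        if entry_mfcc.shape != query_mfcc.shape:
            continue  # only compare windows of equal length
        dist = np.linalg.norm(entry_mfcc - query_mfcc)
        if dist < best_dist:
            best_clip, best_dist = lip_clip, dist
    return best_clip

# Toy usage: 20 random database windows of 25 frames x 13 coefficients.
rng = np.random.default_rng(0)
db = [(rng.normal(size=(25, 13)), f"clip_{i}") for i in range(20)]
print(match_lip_clip(rng.normal(size=(25, 13)), db))
```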
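The abstract states that MML uses only a few fields to coordinate the agent's channels, but does not publish the schema, so the tag and attribute names below are invented. The sketch shows how such a high-level script might be parsed into one timed command per channel for a scheduler to fire in sync.

```python
import xml.etree.ElementTree as ET

# Illustrative MML-style script; element and attribute names are
# assumptions, not the paper's actual markup.
mml = """
<mml>
  <speech text="Hello, nice to meet you."/>
  <head motion="nod" start="0.4"/>
  <expression type="smile" start="0.0" duration="1.5"/>
  <body gesture="wave" start="0.2"/>
</mml>
"""

def parse_channels(doc):
    """Collect one command per channel (speech, head, expression, body)."""
    root = ET.fromstring(doc)
    return {child.tag: dict(child.attrib) for child in root}

print(parse_channels(mml))
```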
doi:10.1007/s12193-011-0073-5