An architecture for fluid real-time conversational agents: integrating incremental output generation and input processing

Stefan Kopp, Herwin van Welbergen, Ramin Yaghoubzadeh, Hendrik Buschmeier
2013 Journal on Multimodal User Interfaces  
Embodied conversational agents still do not achieve the fluidity and smoothness of natural conversational interaction. One main reason is that current systems often respond with high latency and in inflexible ways. We argue that to overcome these problems, real-time conversational agents need to be based on an underlying architecture that provides two essential features for fast and fluent behavior adaptation: a close bi-directional coordination between input processing and output generation, and incrementality of processing at both stages. We propose an architectural framework for conversational agents (ASAP) providing these two ingredients for fluid real-time conversation. The overall architectural concept is described, along with specific means of specifying incremental behavior in BML and technical implementations of different modules. We show how phenomena of fluid real-time conversation, like adapting to user feedback or smooth turn-keeping, can be realized with ASAP, and we describe in detail an example real-time interaction with the implemented system.
doi:10.1007/s12193-013-0130-3 fatcat:h4mqle2vwjcm3ayvaer7wljosm