Toward adaptive conversational interfaces

Sharon Oviatt, Courtney Darves, Rachel Coulston
2004 ACM Transactions on Computer-Human Interaction  
The design of robust interfaces that process conversational speech is a challenging research direction largely because users' spoken language is so variable. This research explored a new dimension of speaker stylistic variation by examining whether users' speech converges systematically with the text-to-speech (TTS) heard from a software partner. To pursue this question, a study was conducted in which twenty-four 7-to-10-year-old children conversed with animated partners that embodied different TTS voices. An analysis of children's amplitude, durational features, and dialogue response latencies confirmed that they spontaneously adapt several basic acoustic-prosodic features of their speech 10-50%, with the largest adaptations involving utterance pause structure and amplitude. Children's speech adaptations were relatively rapid, bidirectional, and dynamically readaptable when introduced to new partners, and generalized across different types of users and TTS voices. Adaptations also occurred consistently, with 70-95% of children converging with their partner's TTS, although individual differences in magnitude of adaptation were evident. In the design of future conversational systems, users' spontaneous convergence could be exploited to guide their speech within system processing bounds, thereby enhancing robustness. Adaptive system processing could yield further significant performance gains. The long-term goal of this research is the development of predictive models of human-computer communication to guide the design of new conversational interfaces.
doi:10.1145/1017494.1017498 fatcat:vwup5p5dgfaxnjvw3lb2k46khe