Predicting hyperarticulate speech during human-computer error resolution

Sharon Oviatt, Margaret MacEachern, Gina-Anne Levow
1998 Speech Communication  
When speaking to interactive systems, people sometimes hyperarticulate, or adopt a clarified form of speech that has been associated with increased recognition errors. The goals of the present study were (1) to establish a flexible simulation method for studying users' reactions to system errors, (2) to analyze the type and magnitude of linguistic adaptations in speech during human-computer error resolution, (3) to provide a unified theoretical model for interpreting and predicting users' adaptations during system error handling, and (4) to outline the implications for developing more robust interactive systems. A semi-automatic simulation method with a novel error generation capability was developed to compare users' speech immediately before and after system recognition errors, and under conditions varying in error base-rate. Matched original-repeat utterance pairs were then analyzed for type and magnitude of linguistic adaptation. When resolving errors with a computer, users were found to actively tailor their speech along a spectrum of hyperarticulation, as a predictable reaction to their perception of the computer as an "at risk" listener. During both low and high error rates, durational changes were pervasive, including elongation of the speech segment and large relative increases in the number and duration of pauses. During a high error rate, speech was also adapted to include more hyper-clear phonological features, fewer disfluencies, and change in fundamental frequency. The two-stage CHAM model (Computer-elicited Hyperarticulate Adaptation Model) is proposed to account for these changes in users' speech during interactive error resolution. © 1998 Elsevier Science B.V. All rights reserved.
doi:10.1016/s0167-6393(98)00005-3 fatcat:v6x4m5xcqfcs5jb67girvmgv4e