Towards building intelligent speech interfaces through the use of more flexible, robust and natural dialogue management solutions

Fernando Fernández-Martínez, J. Ferreiros, J.M. Lucas-Cuesta, J.M. Montero-Martínez, R. San-Segundo, R. Córdoba
2012 Interacting with Computers
This paper presents a Bayesian network-based solution for dialogue modelling, combined with carefully designed strategies for handling contextual information. To validate these solutions, a real-user evaluation has been conducted using a spoken dialogue system for controlling a Hi-Fi audio system as the prototype. Two versions of the prototype are compared. Each version corresponds to a different implementation of the algorithm for managing the actuation order, that is, the algorithm that decides the proper order in which to carry out the actions required by the user. The evaluation relies on a battery of subjective and objective metrics collected from speakers interacting with the Hi-Fi audio box through predefined scenarios. The metrics have been specifically adapted to measure, first, the usefulness and actual relevance of the proposed solutions and, second, their joint performance when combined, measured mainly as the level of user satisfaction achieved. A thorough study of the main differences between the two approaches is presented. Two-way analysis of variance (ANOVA) tests are also included to measure the simultaneous effects of two factors: the system used and the type of scenario. Finally, the effect of bringing this flexibility, robustness and naturalness into our home dialogue system is analyzed through the results obtained. These results show that the intelligence of our speech interface has been well perceived, highlighting its excellent ease of use and its good acceptance by users. They therefore validate the proposed dialogue management solutions and demonstrate that these solutions make a more natural, flexible and robust dialogue possible.
doi:10.1016/j.intcom.2012.09.003 fatcat:d4swyku3gfeszej2ehmcterdva