Crowd-Sourced Collection of Task-Oriented Human-Human Dialogues in a Multi-domain Scenario [chapter]

Norbert Braunschweiler, Panagiotis Papadakos, Margarita Kotti, Yannis Marketakis, Yannis Tzitzikas
2019 Lecture Notes in Computer Science  
There is a lack of high-quality corpora for training task-oriented, end-to-end dialogue systems. This paper describes a dialogue collection process that used crowd-sourcing and a Wizard-of-Oz set-up to collect written human-human dialogues for a task-oriented, multi-domain scenario. The context is a tourism agency, where users try to select their preferred hotel, restaurant, museum or shop. To respond to users, wizards were assisted by an exploratory system supporting ... e-enriched Faceted Search. An important aspect was the wizards' translation of user intent into a number of actions (hard or soft constraints). The main goal was to collect dialogues between a user and an operator that are as realistic as possible and suitable for training end-to-end dialogue systems. This work describes the experience gained and the options considered and decisions taken to minimize human effort and budget, along with the tools used and developed, and describes the resulting dialogue collection in detail.
doi:10.1007/978-3-030-27947-9_34 fatcat:z277rreqprexjggcgxt3pnkqoq