Datasets and Benchmarks for Task-Oriented Log Dialogue Ranking Task

Xinnuo Xu, Yizhe Zhang, Lars Liden, Sungjin Lee
2020 Interspeech 2020  
Although the data-driven approaches of some recent bot building platforms make it possible for a wide range of users to easily create dialogue systems, those platforms don't offer tools for quickly identifying which log dialogues contain problems. Thus, in this paper, we (1) introduce a new task, log dialogue ranking, where the ranker places problematic dialogues higher; (2) provide a collection of human-bot conversations in the restaurant inquiry task, labelled with dialogue quality for ranker training and evaluation; (3) present a detailed description of the data collection pipeline, which is entirely based on crowd-sourcing; and (4) report a benchmark result for dialogue ranking, which shows the usability of the data and sets a baseline for future studies.
Index Terms: dialogue ranking, dialogue quality, language resource, dialogue system
We use the data collection toolkit offered by ParlAI.
doi:10.21437/interspeech.2020-1341 dblp:conf/interspeech/XuZLL20 fatcat:4i4pg2nawbbinmt77qyk76n6ei