Clustering behaviors of Spoken Dialogue Systems users

Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefevre, Olivier Pietquin
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
In order to cope with the data requirements of these methods, but also to evaluate the dialogue strategies, user simulations are built.  ...  Dialogue corpora used to build user simulations are often not annotated from the user's perspective and can thus only simulate generic user behavior, perhaps not representative of any particular user.  ...  The K-means method with K set to 2 (since it is known that only two user behaviors are simulated) is used to cluster the discounted feature vectors.  ...
doi:10.1109/icassp.2012.6289038 dblp:conf/icassp/ChandramohanGLP12 fatcat:5b5ebxgvlvfm5nitcpftu24haa
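
A minimal sketch of the clustering step described in the excerpt above, assuming each dialogue is first summarised as a discounted sum of per-turn feature vectors; the feature dimensionality, discount factor, and data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def discounted_feature_vector(turn_features, gamma=0.95):
    """Aggregate the per-turn feature vectors of one dialogue into a single
    discounted sum: sum_t gamma^t * phi_t."""
    turn_features = np.asarray(turn_features, dtype=float)
    discounts = gamma ** np.arange(len(turn_features))
    return (discounts[:, None] * turn_features).sum(axis=0)

# One discounted vector per simulated dialogue (toy random data here).
rng = np.random.default_rng(0)
vectors = np.stack([
    discounted_feature_vector(rng.random((10, 4))) for _ in range(50)
])

# K = 2 because only two user behaviors are simulated in the cited setup.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print(labels)
```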

On-line learning of a Persian spoken dialogue system using real training data

Maryam Habibi, Hossein Sameti, Hesam Setareh
2010 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010)  
The first spoken dialogue system developed for the Persian language is introduced. This is a ticket reservation system with Persian ASR and NLU modules.  ...  Researchers used simulated training data for the English language [2, 20, 21, 22]. The model-free approach of reinforcement learning yields a near-optimal solution but needs a lot of data for learning.  ...  Conventional dialogue systems use simulated training data for modeling the environment [1].  ...
doi:10.1109/isspa.2010.5605490 dblp:conf/isspa/HabibiSS10 fatcat:hvsr5qs3tbgexfkbydym5elseq

Cascaded LSTMs based Deep Reinforcement Learning for Goal-driven Dialogue [article]

Yue Ma, Xiaojie Wang, Zhenjiang Dong, Hong Chen
2019 arXiv   pre-print
The top part is a feed-forward Deep Neural Network which converts dialogue embeddings into the Q-values of different dialogue actions.  ...  Experimental results show that our model outperforms both the traditional Markov Decision Process (MDP) model and a single LSTM with Deep Q-Network on meeting room booking tasks.  ...  Compared with dialogues with simulators, the average total reward of dialogues with humans drops by less than 3 percent in both the 2-slot and 3-slot tasks.  ...
arXiv:1910.14229v1 fatcat:ix3brhbnfbgelhfn3fd4q5i2la
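
The excerpt describes a feed-forward network on top of the dialogue embedding that outputs one Q-value per dialogue action. A hedged sketch of that mapping follows; the layer sizes and action count are assumptions, and in the paper the embedding would come from the cascaded LSTMs rather than random input.

```python
import torch
import torch.nn as nn

class QHead(nn.Module):
    """Feed-forward head mapping a dialogue embedding to per-action Q-values."""
    def __init__(self, embed_dim=128, hidden_dim=64, num_actions=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_actions),  # one Q-value per dialogue action
        )

    def forward(self, dialogue_embedding):
        return self.net(dialogue_embedding)

q_head = QHead()
dialogue_embedding = torch.randn(1, 128)   # stand-in for the cascaded-LSTM output
q_values = q_head(dialogue_embedding)
action = q_values.argmax(dim=-1)           # greedy action selection
```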

On-line Dialogue Policy Learning with Companion Teaching

Lu Chen, Runzhe Yang, Cheng Chang, Zihao Ye, Xiang Zhou, Kai Yu
2017 Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers  
Simulation experiments showed that, with a small number of human teaching dialogues, the proposed approach can effectively improve user experience at the beginning and smoothly lead to good performance  ...  Here, the dialogue policy is trained using not only the user's reward but also the teacher's example action as well as an estimated immediate reward at turn level.  ...  If the predicted action is the same as the action given by the policy model, the extra reward δ discounted by the probability of the predicted action will be given to the policy model.  ...
doi:10.18653/v1/e17-2032 dblp:conf/eacl/ChenYCYZY17 fatcat:ju6ganm46jffbholid4l3fh5hy
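
The last excerpt describes adding an extra reward δ, weighted by the probability of the predicted action, whenever the predicted action matches the action given by the policy model. A small illustrative sketch of that shaping rule, with hypothetical names and values (not the authors' code):

```python
def shaped_reward(env_reward, predicted_action, policy_action,
                  predicted_action_prob, delta=1.0):
    """Return the turn-level reward plus the teacher-based bonus."""
    extra = 0.0
    if predicted_action == policy_action:
        # extra reward delta, discounted by the probability of the predicted action
        extra = delta * predicted_action_prob
    return env_reward + extra

# e.g. a small per-turn penalty, with the teacher agreeing with the policy
print(shaped_reward(env_reward=-0.05, predicted_action=3,
                    policy_action=3, predicted_action_prob=0.7))
```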

User and Noise Adaptive Dialogue Management Using Hybrid System Actions [chapter]

Senthilkumar Chandramohan, Olivier Pietquin
2010 Lecture Notes in Computer Science  
Our experimental results obtained using simulated users reveal that user and noise adaptive hybrid action selection can perform better than dialogue policies which can only perform simple actions.  ...  A dialogue management policy is a mapping from dialogue states to system actions, i.e., given the state of the dialogue, the dialogue policy determines the next action to be performed by the dialogue manager  ...  Several works have been done in the recent past, such as [14, 12, 5, 16, 19], to simulate channel noise for dialogue modeling.  ...
doi:10.1007/978-3-642-16202-2_2 fatcat:ayaxzahnzne6vbdqk4dblashze

Bayesian Inverse Reinforcement Learning for Modeling Conversational Agents in a Virtual Environment [chapter]

Lina M. Rojas-Barahona, Christophe Cerisara
2014 Lecture Notes in Computer Science  
We show that the proposed approach converges relatively quickly and that it outperforms two baseline systems, including a dialogue manager trained to provide "locally" optimal decisions.  ...  We apply Bayesian Inverse Reinforcement Learning (BIRL) to infer this behavior in the context of a serious game, given evidence in the form of stored dialogues provided by experts who play the role of  ...  This work covers a first step towards dialogue optimization with user simulation.  ... 
doi:10.1007/978-3-642-54906-9_41 fatcat:ejkhzbrkejamtirrpg3oteiwam

Cooperative Multi-Agent Reinforcement Learning with Conversation Knowledge for Dialogue Management

Shuyu Lei, Xiaojie Wang, Caixia Yuan
2020 Applied Sciences  
To address the time-consuming development of the simulator policy, we propose a multi-agent dialogue model where an end-to-end dialogue manager and a user simulator are optimized simultaneously.  ...  In cross-model evaluation with human users involved, the dialogue manager trained with the one-to-many strategy achieves the best performance.  ...  This dialogue manager is optimized with Deep Dyna-Q with a world model and a user simulator.  ...
doi:10.3390/app10082740 fatcat:mdqcngkjwnaypk7qgsrzfsd5ae
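
A rough sketch, under assumptions, of the co-training idea in the excerpt: the dialogue manager and the user simulator are treated as two agents that generate dialogues together and are updated simultaneously. The placeholder agents and action sets below are invented for illustration; a real system would use learned policies and a proper reward signal.

```python
import random

class RandomAgent:
    """Placeholder agent; a real system would use a neural policy."""
    def __init__(self, actions):
        self.actions = actions
    def act(self, observation):
        return random.choice(self.actions)
    def update(self, trajectory):
        pass  # a policy-gradient or Q-learning update would go here

manager = RandomAgent(actions=["request", "inform", "confirm", "bye"])
simulator = RandomAgent(actions=["inform", "deny", "thanks", "bye"])

for episode in range(100):
    trajectory, sys_act = [], None
    for turn in range(10):
        usr_act = simulator.act(sys_act)
        sys_act = manager.act(usr_act)
        trajectory.append((usr_act, sys_act))
        if sys_act == "bye":
            break
    # both agents learn from the same jointly generated dialogue
    manager.update(trajectory)
    simulator.update(trajectory)
```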

Autonomous Sub-domain Modeling for Dialogue Policy with Hierarchical Deep Reinforcement Learning

Giovanni Yoko Kristianto, Huiwen Zhang, Bin Tong, Makoto Iwayama, Yoshiyuki Kobayashi
2018 Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI  
This paper proposes a dialogue framework that autonomously models meaningful sub-domains and learns the policy over them.  ...  Our experiments show that our framework outperforms the baseline without sub-domains by 11% in terms of success rate, and is competitive with the one with manually defined sub-domains.  ...  or success dialogue) at the end of the dialogue • Discount factor γ: 0.95 • Maximum number of turns: 30 • User Simulator: We used an agenda-based user simulator (Schatzmann et al., 2007) with which the belief  ...
doi:10.18653/v1/w18-5702 dblp:conf/emnlp/KristiantoZTIK18 fatcat:2nivhou44jdizio4zxk27nuuiu
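
For reference, the discounted return implied by the excerpt's settings (discount factor γ = 0.95, at most 30 turns) can be computed as below; the per-turn reward scheme used here is an assumption for illustration only, not the paper's reward.

```python
def discounted_return(rewards, gamma=0.95):
    """Compute sum_t gamma^t * r_t by iterating backwards over the rewards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# e.g. a per-turn penalty of -1 over 20 turns plus a +20 success bonus at the end
rewards = [-1.0] * 20
rewards[-1] += 20.0
print(discounted_return(rewards))
```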

A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-Oriented Dialogue Policy Learning [article]

Wai-Chung Kwan, Hongru Wang, Huimin Wang, Kam-Fai Wong
2022 arXiv   pre-print
Dialogue Policy Learning is a key component in a task-oriented dialogue system (TDS) that decides the next action of the system given the dialogue state at each turn.  ...  We believe this survey can shed light on future research in dialogue management.  ...  The main idea was to train the model to output the next system action given the dialogue context.  ...
arXiv:2202.13675v2 fatcat:5hy7vg3cnfeqfh6lxckujnmbqq

A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management [article]

Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić
2018 arXiv   pre-print
Therefore, this paper proposes a set of challenging simulated environments for dialogue model development and evaluation.  ...  Dialogue assistants are rapidly becoming an indispensable daily aid.  ...  when taking action a_t in dialogue state b_t at turn t, and γ is the discount factor.  ...
arXiv:1711.11023v2 fatcat:u3ymmb2akfcgppp5jik3kqssna
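
The truncated excerpt refers to the usual Q-value, i.e. the expected discounted return obtained after taking action a_t in dialogue (belief) state b_t. A minimal Monte Carlo sketch of that quantity follows, with a toy stand-in environment rather than the benchmark's simulated environments; all names and values are assumptions.

```python
import random

def q_estimate(rollout_fn, belief, action, gamma=0.99, n_samples=100):
    """Monte Carlo estimate of Q(b, a) = E[ sum_k gamma^k * r_{t+k} ]."""
    total = 0.0
    for _ in range(n_samples):
        rewards = rollout_fn(belief, action)        # one simulated dialogue
        total += sum(gamma ** k * r for k, r in enumerate(rewards))
    return total / n_samples

def toy_rollout(belief, action):
    """Toy rollout: per-turn penalties ending with a success/failure bonus."""
    n_turns = random.randint(3, 10)
    return [-1.0] * (n_turns - 1) + [random.choice([20.0, -10.0])]

print(q_estimate(toy_rollout, belief=None, action="request_area"))
```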

SimpleDS: A Simple Deep Reinforcement Learning Dialogue System [article]

Heriberto Cuayáhuitl
2016 arXiv   pre-print
This paper presents 'SimpleDS', a simple and publicly available dialogue system trained with deep reinforcement learning.  ...  Our initial results, in the restaurant domain, show that it is indeed possible to induce reasonable dialogue behaviour with an approach that aims for high levels of automation in dialogue control for intelligent  ...  Table 1 shows an example dialogue of the learnt policy with user inputs derived from simulated speech recognition results.  ... 
arXiv:1601.04574v1 fatcat:hxq7ukppcffubpgkdhgev267ti

The Bottleneck Simulator: A Model-Based Deep Reinforcement Learning Approach

Iulian Vlad Serban, Chinnadhurai Sankar, Michael Pieper, Joelle Pineau, Yoshua Bengio
2020 The Journal of Artificial Intelligence Research  
To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn  ...  Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task.  ...  We show human evaluators a dialogue along with 4 candidate responses, and ask them to  ...  [footnote 7: www.evi.com]
doi:10.1613/jair.1.12463 fatcat:2atgr7pihjfrzj4eypbokbhzzq

Subgoal Discovery for Hierarchical Dialogue Policy Learning

Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
Experiments with simulated and real users show that our approach performs competitively against a state-of-the-art method that requires human-defined subgoals.  ...  We demonstrate our method by building a dialogue agent for the composite task of travel planning.  ...  Most of this work was done while DT, CW & LL were with Microsoft.  ...
doi:10.18653/v1/d18-1253 dblp:conf/emnlp/TangLGW0J18 fatcat:ml7twixbenbnhlpm64kcczxvom

The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach [article]

Iulian Vlad Serban, Chinnadhurai Sankar, Michael Pieper, Joelle Pineau, Yoshua Bengio
2018 arXiv   pre-print
To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn  ...  Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task.  ...  State Abstraction: a tabular state-action-value function policy trained with discounted Q-learning on rollouts from the Bottleneck Simulator environment model, with abstract policy state space Z = Z_Dialogue  ...
arXiv:1807.04723v1 fatcat:ftbcm676dzg2bi7ljwbadjq32e
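
A hedged sketch of the setup named in the last excerpt: a tabular state-action-value function trained with discounted Q-learning on rollouts drawn from an environment model over an abstract state space Z. The hand-coded toy model and hyperparameters below are assumptions for illustration, not the paper's learned, factorized transition model.

```python
import random
from collections import defaultdict

ACTIONS = [0, 1]
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1
Q = defaultdict(float)                      # Q[(z, a)] over abstract states z

def model_step(z, a):
    """Stand-in for the learned environment model: next state, reward, done."""
    z_next = (z + a) % 5
    reward = 1.0 if z_next == 4 else 0.0
    return z_next, reward, z_next == 4

for episode in range(2000):                 # discounted Q-learning on simulated rollouts
    z = 0
    for _ in range(20):
        a = random.choice(ACTIONS) if random.random() < EPS else \
            max(ACTIONS, key=lambda x: Q[(z, x)])
        z_next, r, done = model_step(z, a)
        target = r + (0.0 if done else GAMMA * max(Q[(z_next, b)] for b in ACTIONS))
        Q[(z, a)] += ALPHA * (target - Q[(z, a)])
        z = z_next
        if done:
            break

print(max(ACTIONS, key=lambda a: Q[(0, a)]))   # greedy action in the initial abstract state
```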

Subgoal Discovery for Hierarchical Dialogue Policy Learning [article]

Da Tang, Xiujun Li, Jianfeng Gao, Chong Wang, Lihong Li, Tony Jebara
2018 arXiv   pre-print
Experiments with simulated and real users show that our approach performs competitively against a state-of-the-art method that requires human-defined subgoals.  ...  We demonstrate our method by building a dialogue agent for the composite task of travel planning.  ...  Most of this work was done while DT, CW & LL were with Microsoft.  ... 
arXiv:1804.07855v3 fatcat:t3r3nr24xbhsjjx5chw3macoi4