Reinforcement Learning With Human Advice: A Survey

Anis Najar, Mohamed Chetouani
2021 Frontiers in Robotics and AI  
In this paper, we provide an overview of the existing methods for integrating human advice into a reinforcement learning process.  ...  We first propose a taxonomy of the different forms of advice that can be provided to a learning agent.  ...  ACKNOWLEDGMENTS This work was supported by the Romeo2 project. This manuscript has been released as a pre-print at arXiv (Najar and Chetouani, 2020).  ... 
doi:10.3389/frobt.2021.584075 pmid:34141726 pmcid:PMC8205518 fatcat:fqipip7cp5hvlo22xsonqeqzcq

Reinforcement learning with human advice: a survey [article]

Anis Najar, Mohamed Chetouani
2020 arXiv   pre-print
In this paper, we provide an overview of the existing methods for integrating human advice into a Reinforcement Learning process.  ...  We first propose a taxonomy of the different forms of advice that can be provided to a learning agent.  ...  Acknowledgments This work was supported by the Romeo2 project.  ... 
arXiv:2005.11016v2 fatcat:kvomaemvrzfq3lebfewnn4rdqq

A fast hybrid reinforcement learning framework with human corrective feedback

Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
2018 Autonomous Robots  
Learning from human corrective advice COrrective Advice Communicated by Humans (COACH) was proposed for training agents interactively during task execution (Celemin and Ruiz-del-Solar 2015).  ...  Corrective feedback has been used in Argall et al. (2008, 2011), wherein policies for continuous action problems are learned from human corrective advice; this kind of feedback also  ... 
doi:10.1007/s10514-018-9786-6 fatcat:ll72c75pbja73ly23yt4j547qy

Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks [article]

Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
2018 arXiv   pre-print
We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent's actions during execution.  ...  successfully learn policies for continuous action spaces like in the Car Racing and Cart-Pole problems faster than with DRL.  ...  Acknowledgements This work was partially funded by FONDECYT Project 1161500.  ... 
arXiv:1810.00466v1 fatcat:m2cty6zh55e3vgj2ycmo4tk4ce

A Conceptual Framework for Externally-influenced Agents: An Assisted Reinforcement Learning Review [article]

Adam Bignold, Francisco Cruz, Matthew E. Taylor, Tim Brys, Richard Dazeley, Peter Vamplew, Cameron Foale
2020 arXiv   pre-print
These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others.  ...  In this work, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering such collaboration by classifying and comparing various methods that use external information  ...  Acknowledgments This work has been partially supported by the Australian Government Research Training Program (RTP) and the RTP Fee-Offset Scholarship through Federation University Australia.  ... 
arXiv:2007.01544v1 fatcat:iepfl62fyfhudghvjjqhuunjqq

Multi-Channel Interactive Reinforcement Learning for Sequential Tasks

Dorothea Koert, Maximilian Kircher, Vildan Salikutluk, Carlo D'Eramo, Jan Peters
2020 Frontiers in Robotics and AI  
We believe that the findings from this experimental evaluation can be beneficial for the future design of algorithms and interfaces of interactive reinforcement learning systems used by inexperienced users  ...  Nevertheless, there is a lack of experimental evaluations of multi-channel interactive reinforcement learning systems solving robotic tasks with input from inexperienced human users, in particular for  ...  FUNDING This work was funded by the German Federal Ministry of Education and Research (BMBF) project 16SV7984 and by ERC StG 640554.  ... 
doi:10.3389/frobt.2020.00097 pmid:33501264 pmcid:PMC7805623 fatcat:opgd5kb3offm7j3fexz2og76wq

Interactive Learning with Corrective Feedback for Policies Based on Deep Neural Networks [chapter]

Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
2020 Distributed Autonomous Robotic Systems  
We combine Deep Learning with the COrrective Advice Communicated by Humans (COACH) framework, in which non-expert humans shape policies by correcting the agent's actions during execution.  ...  successfully learn policies for continuous action spaces like in the Car Racing and Cart-Pole problems faster than with DRL.  ...  This work was partially funded by FONDECYT Project 1161500 and CONICYT/PIA Project AFB180004.  ... 
doi:10.1007/978-3-030-33950-0_31 fatcat:icyg3pv7vzgnjgop6jw3luapie

Continuous Control for High-Dimensional State Spaces: An Interactive Learning Approach [article]

Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
2019 arXiv   pre-print
D-COACH is a Deep Learning based extension of COACH (COrrective Advice Communicated by Humans), where humans are able to shape policies through corrective advice.  ...  In this context, we analyze the use of human corrective feedback during task execution to learn policies with high-dimensional state spaces, by using the D-COACH framework, and we propose new variants  ...  Our work extends D-COACH [10], which is a Deep Learning (DL) based extension of the COrrective Advice Communicated by Humans (COACH) framework [9].  ... 
arXiv:1908.05256v1 fatcat:fmatudbbnrdo3emabvbsribd7m

Reinforcement learning of motor skills using Policy Search and human corrective advice

Carlos Celemin, Guilherme Maeda, Javier Ruiz-del-Solar, Jan Peters, Jens Kober
2019 The International Journal of Robotics Research  
In this work, we propose the use of human corrective advice in the actions domain for learning motor trajectories.  ...  Some reinforcement learning methods, like Policy Search, offer stable convergence toward locally optimal solutions, whereas interactive machine learning or learning-from-demonstration methods allow fast  ...  Rui Silva, Manuela Veloso at Carnegie Mellon University, and Dorothea Koert and Marco Ewerton at Technische Universität Darmstadt for their constructive and valuable discussions during the development of  ... 
doi:10.1177/0278364919871998 fatcat:bacdjo34p5c3rfj45z3wmeg2w4

Continuous Control for High-Dimensional State Spaces: An Interactive Learning Approach

Rodrigo Perez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober
2019 2019 International Conference on Robotics and Automation (ICRA)  
D-COACH is a Deep Learning based extension of COACH (COrrective Advice Communicated by Humans), where humans are able to shape policies through corrective advice.  ...  In this context, we analyze the use of human corrective feedback during task execution to learn policies with high-dimensional state spaces, by using the D-COACH framework, and we propose new variants of  ...  Our work extends D-COACH [10], which is a Deep Learning (DL) based extension of the COrrective Advice Communicated by Humans (COACH) framework [9].  ... 
doi:10.1109/icra.2019.8793675 dblp:conf/icra/Perez-DattariCR19 fatcat:a32c2q6n4fcuzm4jyibfgkcnpy

Verification and Synthesis of Human-Robot Interaction (Dagstuhl Seminar 19081)

Rachid Alami, Kerstin I. Eder, Guy Hoffman, Hadas Kress-Gazit, Michael Wagner
2019 Dagstuhl Reports  
This seminar brought together researchers from two distinct communities, Formal Methods for Robotics and Human-Robot Interaction, to discuss the path towards creating safe and verifiable autonomous systems  ...  This report documents the program and the outcomes of Dagstuhl Seminar 19081 "Verification and Synthesis of Human-Robot Interaction".  ...  of human behavior in the field of Human Robot Interaction (HRI) and the models used by the formal methods community.  ... 
doi:10.4230/dagrep.9.2.91 dblp:journals/dagstuhl-reports/AlamiEHK19 fatcat:vcka4bb2uvgtvf4mdgjdde52e4

Persistent Rule-based Interactive Reinforcement Learning [article]

Adam Bignold, Francisco Cruz, Richard Dazeley, Peter Vamplew, Cameron Foale
2021 arXiv   pre-print
Interactive reinforcement learning has allowed speeding up the learning process in autonomous agents by including a human trainer providing extra information to the agent in real-time.  ...  Our experimental results show persistent advice substantially improves the performance of the agent while reducing the number of interactions required for the trainer.  ...  Acknowledgments This work has been partially supported by the Australian Government Research Training Program (RTP) and the RTP Fee-Offset Scholarship through Federation University Australia.  ... 
arXiv:2102.02441v2 fatcat:ds6myvafkbbt3c3vu5nh47ttei

A Broad-persistent Advising Approach for Deep Interactive Reinforcement Learning in Robotic Environments [article]

Hung Son Nguyen, Francisco Cruz, Richard Dazeley
2021 arXiv   pre-print
Deep Interactive Reinforcement Learning (DeepIRL) includes interactive feedback from an external trainer or expert giving advice to help learners choosing actions to speed up the learning process.  ...  However, current research has been limited to interactions that offer actionable advice to only the current state of the agent.  ...  Human advice, on the other hand, is not 100% correct [23] .  ... 
arXiv:2110.08003v2 fatcat:g3cnl3cczra6jcbceajqtv6fd4

Deep Reinforcement Learning with Interactive Feedback in a Human–Robot Environment

Ithan Moreira, Javier Rivas, Francisco Cruz, Richard Dazeley, Angel Ayala, Bruno Fernandes
2020 Applied Sciences  
A plausible approach to address this issue is interactive feedback, where a trainer advises a learner on which actions should be taken from specific states to speed up the learning process.  ...  deep reinforcement learning using a previously trained artificial agent as an advisor (agent–IDeepRL); and (iii) interactive deep reinforcement learning using a human advisor (human–IDeepRL).  ...  There are different ways of using the budget, in terms of when to interact or give advice, namely, early advising, alternating advice, importance advising, mistake correcting, and predictive advising.  ... 
doi:10.3390/app10165574 fatcat:sbq7s3iuknhrdb65kuaz5wyoxq

Deep Reinforcement Learning with Interactive Feedback in a Human-Robot Environment [article]

Ithan Moreira, Javier Rivas, Francisco Cruz, Richard Dazeley, Angel Ayala, Bruno Fernandes
2020 arXiv   pre-print
A plausible approach to address this issue is interactive feedback, where a trainer advises a learner on which actions should be taken from specific states to speed up the learning process.  ...  deep reinforcement learning using a previously trained artificial agent as an advisor (agent-IDeepRL); and (iii) interactive deep reinforcement learning using a human advisor (human-IDeepRL).  ...  Acknowledgments This research was partially funded by Universidad Central de Chile under the research project CIP2018009, the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES)  ... 
arXiv:2007.03363v2 fatcat:eidnzkx3hncorazkjtjcyik354