Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments [article]

Rémy Portelas, Cédric Colas, Katja Hofmann, Pierre-Yves Oudeyer
2019 arXiv   pre-print
We consider the problem of how a teacher algorithm can enable an unknown Deep Reinforcement Learning (DRL) student to become good at a skill over a wide range of diverse environments.  ...  Using parameterized variants of the BipedalWalker environment, we study their efficiency to personalize a learning curriculum for different learners (embodiments), their robustness to the ratio of learnable  ...  First study of ALP-based teacher algorithms leveraged to scaffold the learning of generalist DRL agents in continuously parameterized environments. See Sec. 5.  ... 
arXiv:1910.07224v1 fatcat:2lfkj5aj7jewjiz6lugjzhhtyy

Curriculum-Based Deep Reinforcement Learning for Adaptive Robotics: A Mini-Review

Kashish Gupta, Homayoun Najjaran
2021 International Journal of Robotic Engineering  
At the same time, the introduction of curriculum learning has made the reinforcement learning process significantly more efficient and allowed for training on much broader tasks.  ...  Recent progress in deep reinforcement learning has corroborated its potential to train such autonomous and robust agents.  ...  Portelas R, Colas C, Hofmann K, Oudeyer PY (2019) Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments.  ... 
doi:10.35840/2631-5106/4131 fatcat:tnoa4vd4yrgnpjzesxr5a3jq2m

DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning [article]

Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny
2021 arXiv   pre-print
We propose a framework, Data CUrriculum for Reinforcement learning (DCUR), which first trains teachers using online deep RL, and stores the logged environment interaction history.  ...  We test teachers and students using state-of-the-art deep RL algorithms across a variety of data curricula.  ...  In Deep RL, the policy π_θ is parameterized by a deep neural network with parameters θ.  ... 
arXiv:2109.07380v1 fatcat:ijzrauhisbgxdgyzfqsrbjdqnm

Curriculum in Gradient-Based Meta-Reinforcement Learning [article]

Bhairav Mehta, Tristan Deleu, Sharath Chandra Raparthy, Chris J. Pal, Liam Paull
2020 arXiv   pre-print
meta-RL in a similar way as ADR does for sim2real transfer.  ...  In this work, we begin by highlighting intriguing failure cases of gradient-based meta-RL and show that task distributions can wildly affect algorithmic outputs, stability, and performance.  ...  BM would like to thank Glen Berseth for helpful discussions in early drafts of this work, and IVADO for financial support.  ... 
arXiv:2002.07956v1 fatcat:htcm2ijusvgi7dw6ypjca4htde

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning [article]

Yangyang Zhao, Zhenyu Wang, Zhenhua Huang
2020 arXiv   pre-print
for automatic curriculum learning.  ...  We propose a novel framework, Automatic Curriculum Learning-based Deep Q-Network (ACL-DQN), which replaces the traditional random sampling method with a teacher policy model to realize the dialogue policy  ...  We thank the anonymous reviewers for their insightful feedback on the work, and we would like to acknowledge the volunteers from South China University of Technology for helping us with the human experiments  ... 
arXiv:2012.14072v1 fatcat:peand5lvsjhytjiny4djpe5kxe

Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning [article]

Rémy Portelas, Katja Hofmann, Pierre-Yves Oudeyer
2020 arXiv   pre-print
A major challenge in the Deep RL (DRL) community is to train agents able to generalize over unseen situations, which is often approached by training them on a diversity of tasks (or environments).  ...  We address this problem by proposing a two stage ACL approach where 1) a teacher algorithm first learns to train a DRL agent with a high-exploration curriculum, and then 2) distills learned priors from  ...  Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments, 2019.  ... 
arXiv:2004.03168v1 fatcat:m2eyobgs2zhrzinhyuo2qog26q

Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch [article]

Michael Wan, Tanmay Gangwani, Jian Peng
2020 arXiv   pre-print
Deep reinforcement learning (RL) algorithms have achieved great success on a wide variety of sequential decision-making tasks.  ...  In this paper, we propose a new framework for transfer learning where the teacher and the student can have arbitrarily different state- and action-spaces.  ...  INTRODUCTION Deep reinforcement learning (RL), which combines the rigor of RL algorithms with the flexibility of universal function approximators such as deep neural networks, has demonstrated a plethora  ... 
arXiv:2006.07041v1 fatcat:5wy2n322fvfhrleaowjy2xjs3e

Safe Reinforcement Learning via Curriculum Induction [article]

Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, Alekh Agarwal
2021 arXiv   pre-print
Based on observing agents' progress, the teacher itself learns a policy for choosing the reset controllers, a curriculum, to optimize the agent's final policy reward.  ...  Our experiments use this framework in two environments to induce curricula for safe and efficient learning.  ...  This work was supported by the Max Planck ETH Center for Learning Systems.  ... 
arXiv:2006.12136v2 fatcat:nrk22liksfdg3derjntf7mqgum

Generating Automatic Curricula via Self-Supervised Active Domain Randomization [article]

Sharath Chandra Raparthy, Bhairav Mehta, Florian Golemo, Liam Paull
2020 arXiv   pre-print
Our results show that a curriculum of co-evolving the environment difficulty together with the difficulty of goals set in each environment provides practical benefits in the goal-directed tasks tested.  ...  Goal-directed RL has seen large gains in sample efficiency, due to the ease of reusing or generating new experience by proposing goals.  ...  Acknowledgements The authors gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), the Fonds de Recherche Nature et Technologies Quebec (FQRNT), Calcul Quebec,  ... 
arXiv:2002.07911v2 fatcat:7csdcamosjejdbajospd2z6xbm

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey [article]

Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone
2020 arXiv   pre-print
Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research.  ...  In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals  ...  Part of this work has taken place in the Learning Agents Research Group (LARG) at the Artificial Intelligence Laboratory, The University of Texas at Austin. LARG re-  ... 
arXiv:2003.04960v2 fatcat:iacmqeb7jjeezpo27jsnzuqb7u

Reinforcement Teaching [article]

Alex Lewandowski, Calarina Muslimani, Dale Schuurmans, Matthew E. Taylor, Jun Luo
2022 arXiv   pre-print
To demonstrate the generality of Reinforcement Teaching, we conduct experiments where a teacher learns to significantly improve both reinforcement and supervised learning algorithms, outperforming hand-crafted  ...  We develop a unifying meta-learning framework, called Reinforcement Teaching, to improve the learning process of any algorithm.  ...  ] and DoubleDQN [61; 68] were sufficient to learn adaptive teaching behaviour and leave investigation of more advanced deep RL algorithms for future work.  ... 
arXiv:2204.11897v2 fatcat:c2226jnpmfhvvjh25d3jyuseqy

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems [article]

Jack Parker-Holder, Raghu Rajan, Xingyou Song, André Biedenkapp, Yingjie Miao, Theresa Eimer, Baohe Zhang, Vu Nguyen, Roberto Calandra, Aleksandra Faust, Frank Hutter, Marius Lindauer
2022 arXiv   pre-print
The combination of Reinforcement Learning (RL) with deep learning has led to a series of impressive feats, with many believing (deep) RL provides a path towards generally capable agents.  ...  Given the diversity of methods and environments considered in RL, much of the research has been conducted in distinct subfields, ranging from meta-learning to evolution.  ...  Acknowledgements We would like to thank Jie Tan for providing feedback on the survey, as well as Sagi Perel and Daniel Golovin for valuable discussions.  ... 
arXiv:2201.03916v1 fatcat:4j2ycfj6czgxvjn7goxnhxsvzm

Investigating Value of Curriculum Reinforcement Learning in Autonomous Driving Under Diverse Road and Weather Conditions [article]

Anil Ozturk, Mustafa Burak Gunel, Resul Dagdanov, Mirac Ekim Vural, Ferhat Yurdakul, Melih Dal, Nazim Kemal Ure
2021 arXiv   pre-print
The main contribution of this paper is a systematic study investigating the value of curriculum reinforcement learning in autonomous driving applications.  ...  Applications of reinforcement learning (RL) are popular in autonomous driving tasks.  ...  The usage of curriculum learning in recent works [7], [8] and the Teacher-Student Curriculum Learning Framework [9] shows that curriculum learning improved training performance.  ... 
arXiv:2103.07903v3 fatcat:xabueihhbbhabjzoxu4yu6dwli

Adaptive Procedural Task Generation for Hard-Exploration Problems [article]

Kuan Fang, Yuke Zhu, Silvio Savarese, Li Fei-Fei
2021 arXiv   pre-print
To enable curriculum learning in the absence of a direct indicator of learning progress, we propose to train the task generator by balancing the agent's performance in the generated tasks and the similarity  ...  At the heart of our approach, a task generator learns to create tasks from a parameterized task space via a black-box procedural generation module.  ...  We would like to thank Roberto Martín-Martín, Austin Narcomey, Sriram Somasundaram, Fei Xia, and Danfei Xu for feedback on an early draft of the paper.  ... 
arXiv:2007.00350v3 fatcat:325qus6ovjhonf2fheip6sahei

Observational Learning by Reinforcement Learning [article]

Diana Borsa, Bilal Piot, Rémi Munos, Olivier Pietquin
2017 arXiv   pre-print
Through simple scenarios, we demonstrate that an RL agent can leverage the information provided by the observations of another agent performing a task in a shared environment.  ...  The latter is naturally modeled by RL, by correlating the learning agent's reward with the teacher agent's behaviour.  ...  The main questions we would want to answer are then: is (deep) RL coupled with memory enough to successfully tackle observational learning? Will the RL agent learn to ignore or leverage the teacher?  ... 
arXiv:1706.06617v1 fatcat:373blxc2rnfqvksiskyhcqezoy