Transfer of Experience Between Reinforcement Learning Environments with Progressive Difficulty

Michael G. Madden, Tom Howley
2004 Artificial Intelligence Review  
A test domain with 15 maze environments, arranged in order of difficulty, is described.  ...  This paper describes an extension to reinforcement learning (RL), in which a standard RL algorithm is augmented with a mechanism for transferring experience gained in one problem to new but related problems  ...  Notes 1 This differs to the value of 42.07% reported in Section 4.1 because it does not include the results of Maze 2 in this case.  ... 
doi:10.1023/b:aire.0000036264.95672.64 fatcat:u567uicmp5fe7cpvxds6oo6k5i

Transferring Deep Reinforcement Learning with Adversarial Objective and Augmentation [article]

Shu-Hsuan Hsu, I-Chao Shen, Bing-Yu Chen
2018 arXiv   pre-print
The next step of intelligent agents would be able to generalize between tasks, and using prior experience to pick up new skills more quickly.  ...  Our approach enables the agents to generalize knowledge from a single source task, and boost the learning progress with a semisupervised learning method when facing a new task.  ...  Experiments In the following experiments, we evaluate the transfer effectiveness of our method using the Arcade Learning Environment (ALE) [4].  ... 
arXiv:1809.00770v1 fatcat:zmibha7fq5c5nmypcw7ijbsy3m

Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment [article]

Yilei Zeng, Jiali Duan, Yang Li, Emilio Ferrara, Lerrel Pinto, C.-C. Jay Kuo, Stefanos Nikolaidis
2022 arXiv   pre-print
It shows reinforcement learning performance can successfully adjust in sync with the human desired difficulty level.  ...  Human-centered AI considers human experiences with AI performance.  ...  We identified a phenomenon of over-fitting in auto-curriculum that leads to deteriorating performance during skill transfer with this environment.  ... 
arXiv:2208.02932v1 fatcat:6bqukbmo7fegrnwrqxhizbygd4

Curriculum Learning Based on Reward Sparseness for Deep Reinforcement Learning of Task Completion Dialogue Management

Atsushi Saito
2018 Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI  
Learning from sparse and delayed reward is a central issue in reinforcement learning.  ...  This curriculum makes it possible to learn dialogue management for sets of user goals with large number of slots.  ...  The proposed method trains progressive neural networks to transfer knowledge across sets of user goals, and a theoretical relationship between transfer learning and curriculum learning is studied in (  ... 
doi:10.18653/v1/w18-5707 dblp:conf/emnlp/Saito18 fatcat:7xxn6g7ssrdz3lwsuynsuheqja

Generating Automatic Curricula via Self-Supervised Active Domain Randomization [article]

Sharath Chandra Raparthy, Bhairav Mehta, Florian Golemo, Liam Paull
2020 arXiv   pre-print
Our results show that a curriculum of co-evolving the environment difficulty together with the difficulty of goals set in each environment provides practical benefits in the goal-directed tasks tested.  ...  Goal-directed Reinforcement Learning (RL) traditionally considers an agent interacting with an environment, prescribing a real-valued reward to an agent proportional to the completion of some goal.  ...  Acknowledgements The authors gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), the Fonds de Recherche Nature et Technologies Quebec (FQRNT), Calcul Quebec,  ... 
arXiv:2002.07911v2 fatcat:7csdcamosjejdbajospd2z6xbm

Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning [article]

Nick Erickson, Qi Zhao
2017 arXiv   pre-print
This paper introduces Dex, a reinforcement learning environment toolkit specialized for training and evaluation of continual learning methods as well as general reinforcement learning problems.  ...  We also present the novel continual learning method of incremental learning, where a challenging environment is solved using optimal weight initialization learned from first solving a similar easier environment  ...  Recent work has been done with Progressive Neural Networks [17], where transfer learning was used to apply positive transfer in a variety of reinforcement learning domains.  ... 
arXiv:1706.05749v1 fatcat:zbozkc3jvravpcv2iw3cgwzz5q

How Transferable are the Representations Learned by Deep Q Agents? [article]

Jacob Tyo, Zachary Lipton
2020 arXiv   pre-print
While for DRL agents, the distinction between representation and policy may not be clear, we seek new insight through a set of transfer learning experiments.  ...  In this paper, we consider the source of Deep Reinforcement Learning (DRL)'s sample complexity, asking how much derives from the requirement of learning useful representations of environment states and  ...  Taylor & Stone (2009) provides a survey of transfer learning techniques in reinforcement learning.  ... 
arXiv:2002.10021v1 fatcat:gkppubfgljgclk2dmxz3hkx7fi

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey [article]

Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone
2020 arXiv   pre-print
To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task.  ...  Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios  ...  Part of this work has taken place in the Learning Agents Research Group (LARG) at the Artificial Intelligence Laboratory, The University of Texas at Austin. LARG re-  ... 
arXiv:2003.04960v2 fatcat:iacmqeb7jjeezpo27jsnzuqb7u

Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control [article]

Lukas Hermann, Max Argus, Andreas Eitel, Artemij Amiranashvili, Wolfram Burgard, Thomas Brox
2020 arXiv   pre-print
We propose Adaptive Curriculum Generation from Demonstrations (ACGD) for reinforcement learning in the presence of sparse rewards.  ...  We show that training vision-based control policies in simulation while gradually increasing the difficulty of the task via ACGD improves the policy transfer to the real world.  ...  Reinforcement Learning In reinforcement learning an agent makes some observation (o_t) of an underlying environment state, which is used by a policy to compute an action a_t = π(o_t).  ... 
arXiv:1910.07972v3 fatcat:dexm2xv3jzfn7ihrqkyixkeb2u

DDoS Traffic Control using Transfer Learning DQN with Structure Information

Shi-ming Xia, Lei Zhang, Wei Bai, Xing-yu Zhou, Zhi-song Pan
2019 IEEE Access  
Therefore, we can learn a better policy with less time consumption. Moreover, with progressive transfer learning, we can promote our method in a more complex environment.  ...  INDEX TERMS Distributed denial of service, router throttling, deep network, team structure information, multiagent reinforcement learning, progressive transfer learning.  ...  CONFLICTS OF INTEREST The authors declare that there is no conflict of interest regarding the publication of this paper.  ... 
doi:10.1109/access.2019.2923993 fatcat:rij76ip3ejab5lu542mycyelja

Sim-to-Real Robot Learning from Pixels with Progressive Nets [article]

Andrei A. Rusu, Mel Vecerik, Thomas Rothörl, Nicolas Heess, Razvan Pascanu, Raia Hadsell
2018 arXiv   pre-print
We present an early demonstration of this approach with a number of experiments in the domain of robot manipulation that focus on bridging the reality gap.  ...  We propose using progressive networks to bridge the reality gap and transfer learned policies from simulation to the real world.  ...  For real Jaco experiments, both learning rates and entropy costs were optimized separately using a simulated transfer experiment with a single-threaded agent (A2C).  ... 
arXiv:1610.04286v2 fatcat:xmbjazlxxnd65ag27igitdmppa

Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey [article]

Wenshuai Zhao, Jorge Peña Queralta, Tomi Westerlund
2020 arXiv   pre-print
Nonetheless, the gap between the simulated and real worlds degrades the performance of the policies once the models are transferred into real robots.  ...  Deep reinforcement learning has recently seen huge success across multiple areas in the robotics domain.  ...  ACKNOWLEDGEMENTS This work was supported by the Academy of Finland's AutoSOS project with grant number 328755.  ... 
arXiv:2009.13303v1 fatcat:7xjickbrh5avlohasquqyxlhrq

SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning [article]

Jiaqi Xu, Bin Li, Bo Lu, Yun-Hui Liu, Qi Dou, Pheng-Ann Heng
2021 arXiv   pre-print
The existing learning-based simulation platforms for medical robots suffer from limited scenarios and simplified physical interactions, which degrades the real-world performance of learned policies.  ...  Recent learning-based methods, especially reinforcement learning (RL) based methods, achieve promising performance for dexterous manipulation, which usually requires the simulation to collect data efficiently  ...  SurRoL provides dVRK compatible simulation environments for surgical robot learning (left), with Gym-like interfaces for reinforcement learning algorithm development and ranges of surgical contents with  ... 
arXiv:2108.13035v1 fatcat:2qwmngvxqnedncqu5w253mvmie

Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning [article]

Matthias Müller-Brockhausen, Mike Preuss, Aske Plaat
2021 arXiv   pre-print
The idea of transfer in reinforcement learning (TRL) is intriguing: being able to transfer knowledge from one problem to another problem without learning everything from scratch.  ...  Promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs. (3) The lack of benchmarking tools will be remedied to enable meaningful comparison and measure progress  ...  Transfer reinforcement learning should continue to focus on transfer of network parameters. (2) The large diversity in applications and methods makes progress comparisons difficult.  ... 
arXiv:2105.14780v1 fatcat:wribk6g67nbxfa3xqhvy3rlg2q

Active Domain Randomization [article]

Bhairav Mehta, Manfred Diaz, Florian Golemo, Christopher J. Pal, Liam Paull
2019 arXiv   pre-print
Our experiments show that domain randomization may lead to suboptimal, high-variance policies, which we attribute to the uniform sampling of environment parameters.  ...  Our method looks for the most informative environment variations within the given randomization ranges by leveraging the discrepancies of policy rollouts in randomized and reference environment instances  ...  In addition, the authors would like to thank Kyle Kastner and members of the REAL Lab for their helpful comments.  ... 
arXiv:1904.04762v2 fatcat:lik3ed3otjcaflene2i5sftv4y