69,982 Hits in 9.0 sec

Improving Sample Efficiency in Model-Free Reinforcement Learning from Images [article]

Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus
2020 arXiv   pre-print
Training an agent to solve control tasks directly from high-dimensional images with model-free reinforcement learning (RL) has proven difficult.  ...  This results in a simple approach capable of matching state-of-the-art model-free and model-based algorithms on MuJoCo control tasks.  ...  These model-based reinforcement learning methods often show improved sample efficiency, but with the additional complexity of balancing various auxiliary losses, such as a dynamics loss, reward loss, and  ... 
arXiv:1910.01741v3 fatcat:m54hggrwpbed3o6a5auztbf2du

Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model [article]

Thanh Nguyen, Tung M. Luu, Thang Vu, Chang D. Yoo
2021 arXiv   pre-print
This paper considers a learning framework for a Curiosity Contrastive Forward Dynamics Model (CCFDM) for achieving more sample-efficient RL directly from raw pixels.  ...  towards improving sample efficiency and generalization.  ...  Therein, the combination of contrastive learning and data augmentation techniques from computer vision with model-free RL shows certain improvements in sample efficiency on common RL benchmarks. CCFDM trains  ... 
arXiv:2103.08255v2 fatcat:cnp3ep4ajfgezdtfw5agsm3rua

CURL: Contrastive Unsupervised Representations for Reinforcement Learning [article]

Aravind Srinivas, Michael Laskin, Pieter Abbeel
2020 arXiv   pre-print
On the DeepMind Control Suite, CURL is the first image-based algorithm to nearly match the sample efficiency of methods that use state-based features.  ...  CURL outperforms prior pixel-based methods, both model-based and model-free, on complex tasks in the DeepMind Control Suite and Atari Games, showing 1.9x and 1.2x performance gains at the 100K environment  ...  Improving sample efficiency in model-free reinforcement learning from images. arXiv preprint arXiv:1910.01741, 2019. Table 3. Hyperparameters used for DMControl CURL experiments.  ... 
arXiv:2004.04136v4 fatcat:fek5n6xsn5f23efn2anivekvde
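The contrastive idea behind CURL can be illustrated with a minimal InfoNCE-style sketch (a hypothetical NumPy rendering, not CURL's actual implementation; function name, shapes, and the temperature value are illustrative): each observation is embedded twice under different augmentations, and the matching pair is scored against every other pair in the batch.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch.

    anchors, positives: (batch, dim) embeddings of two augmented views
    of the same observations; every other row in the batch serves as a
    negative for a given anchor.
    """
    # Normalize so dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching pair sits on the diagonal; minimize its negative log-prob.
    return -np.mean(np.diag(log_probs))
```

Aligned pairs should score a much lower loss than mismatched ones, which is what drives the encoder toward augmentation-invariant features.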

Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning [article]

Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto
2021 arXiv   pre-print
We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control.  ...  Notably, DrQ-v2 is able to solve complex humanoid locomotion tasks directly from pixel observations, previously unattained by model-free RL.  ...  Acknowledgements This research is supported in part by DARPA through the Machine Common Sense Program.  ... 
arXiv:2107.09645v1 fatcat:37obrkfuubc4fidzyy5d6xhzne

Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images

Bang You, Oleg Arenz, Youping Chen, Jan Peters
2022 Neurocomputing  
In particular, methods based on contrastive learning that induce linearity of the latent dynamics or invariance to data augmentation have been shown to greatly improve the sample efficiency of the reinforcement  ...  Recent methods for reinforcement learning from images use auxiliary tasks to learn image features that are used by the agent's policy or Q-function.  ...  [4, 5] substantially improved the sample efficiency of image-based reinforcement learning, by learning a predictive model on a learned latent embedding of the state.  ... 
doi:10.1016/j.neucom.2021.12.094 fatcat:xiwbnhs545axlbzccj4rgry2ui

Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation [article]

Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine
2018 arXiv   pre-print
We then instantiate this graph to form a navigation model that learns from raw images and is sample efficient.  ...  In contrast, learning-based methods improve as the robot acts in the environment, but are difficult to deploy in the real world due to their high sample complexity.  ...  This research was funded in part by the Army Research Office through the MAST program, the National Science Foundation through IIS-1614653, NVIDIA, and the Berkeley Deep Drive consortium.  ... 
arXiv:1709.10489v3 fatcat:gojv4oqs7ze35noymyti46il2i

An Efficient and Accurate DDPG-based Recurrent Attention Model for Object Localization

Fengkai Ke
2020 IEEE Access  
To overcome this shortcoming and improve the learning efficiency and stability of RAM, this paper proposes a DDPG-based RAM.  ...  His main research interests include deep learning, reinforcement learning, medical image processing, and so on.  ...  RAM integrates the strong nonlinear fitting ability of neural networks and the advantages of model-free learning in reinforcement learning. Reinforcement learning is an essential part of RAM.  ... 
doi:10.1109/access.2020.3008171 fatcat:hr3e2i5qdvfxppzcryp5m7becm

Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention [article]

Bharat Prakash, Mohit Khatwani, Nicholas Waytowich, Tinoosh Mohsenin
2019 arXiv   pre-print
Recent progress in AI and Reinforcement learning has shown great success in solving complex problems with high dimensional state spaces.  ...  We present a hybrid method for reducing the human intervention time by combining model-based approaches and training a supervised learner to improve sample efficiency while also ensuring safety.  ...  The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government.  ... 
arXiv:1903.09328v1 fatcat:b2zsyziyprc7rgiukw4durauzu

Imaginary Hindsight Experience Replay: Curious Model-based Learning for Sparse Reward Tasks [article]

Robert McCarthy, Stephen J. Redmond
2021 arXiv   pre-print
Model-based reinforcement learning is a promising learning strategy for practical robotic applications due to its improved data-efficiency versus model-free counterparts.  ...  Upon evaluation, this approach provides an order of magnitude increase in data-efficiency on average versus the state-of-the-art model-free method in the benchmark OpenAI Gym Fetch Robotics tasks.  ...  Since the model can be learned in an entirely supervised manner, the primary advantage of model-based RL is a significant improvement in sample efficiency.  ... 
arXiv:2110.02414v1 fatcat:wg2gdug74ngwtel4wzvacxdabq

Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey [article]

Aske Plaat, Walter Kosters, Mike Preuss
2020 arXiv   pre-print
Model-based reinforcement learning creates an explicit model of the environment dynamics to reduce the need for environment samples.  ...  Unfortunately, the sample complexity of most deep reinforcement learning methods is high, precluding their use in some important applications.  ...  ACKNOWLEDGMENTS We thank the members of the Leiden Reinforcement Learning Group, and especially Thomas Moerland and Mike Huisman, for many discussions and insights.  ... 
arXiv:2008.05598v2 fatcat:5xmwmemv5bfinkw57avf5ghhxq

Mastering Atari with Discrete World Models [article]

Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
2022 arXiv   pre-print
World models facilitate such generalization and allow learning behaviors from imagined outcomes to increase sample efficiency.  ...  We introduce DreamerV2, a reinforcement learning agent that learns behaviors purely from predictions in the compact latent space of a powerful world model.  ...  The Reactor: A Fast and Sample-Efficient Actor-Critic Agent for Reinforcement Learning. arXiv preprint arXiv:1704.04651, 2017. D. Ha, J. Schmidhuber. World Models.  ... 
arXiv:2010.02193v4 fatcat:7rjazcce75e7tbky5gpemuaqbq
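"Learning behaviors purely from predictions" can be sketched as an imagined rollout: starting from a latent state, the agent repeatedly acts with its policy and steps a learned dynamics model instead of the environment. The sketch below uses stand-in callables, not DreamerV2's actual networks; all names and the horizon are assumptions.

```python
import numpy as np

def imagine_rollout(z0, policy, dynamics, reward_model, horizon=15):
    """Latent imagination in the spirit of world-model agents: roll the
    learned dynamics forward from latent state z0, acting with the current
    policy, and collect predicted rewards -- no environment steps needed.
    """
    z, states, rewards = z0, [], []
    for _ in range(horizon):
        a = policy(z)
        z = dynamics(z, a)            # predicted next latent state
        states.append(z)
        rewards.append(reward_model(z))
    return states, np.array(rewards)
```

The policy and value function are then trained on these imagined trajectories, which is where the sample-efficiency gain over model-free training comes from.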

A Gamified Assessment Platform for Predicting the Risk of Dementia +Parkinson's disease (DPD) Co-Morbidity

Zhiwei Zeng, Hongchao Jiang, Yanci Zhang, Zhiqi Shen, Jun Ji, Martin J. Mckeown, Jing Jih Chin, Cyril Leung, Chunyan Miao
2020 Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence  
We investigate useful in-game behaviour markers which can support machine learning-based predictive analytics on seniors' risk of developing DPD co-morbidity.  ...  As people live longer, they also tend to suffer from more challenging medical conditions.  ...  Discussion: In this work, I plan to propose representation learning methods to obtain a compact state space from raw observations to improve the sample efficiency of DRL.  ... 
doi:10.24963/ijcai.2020/748 dblp:conf/ijcai/Zhu20 fatcat:tmriqyaoxzcidlj2lhua5afcpe

Visual Reinforcement Learning with Imagined Goals [article]

Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine
2018 arXiv   pre-print
We also propose a retroactive goal relabeling scheme to further improve the sample efficiency of our method.  ...  In this paper, we propose an algorithm that acquires such general-purpose skills by combining unsupervised representation learning and reinforcement learning of goal-conditioned policies.  ...  We would also like to thank Carlos Florensa for making multiple useful suggestions on a later version of the draft.  ... 
arXiv:1807.04742v2 fatcat:65icyi2f6vctfjluyig6vqo5su
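Retroactive goal relabeling, in the hindsight-experience-replay spirit, can be sketched as follows (a minimal illustration; the dictionary keys, `reward_fn` signature, and `k` are assumptions, not the paper's actual data layout): each stored transition is duplicated with its goal replaced by a state actually achieved later in the episode, and the reward recomputed under that substituted goal.

```python
import random

def relabel_episode(episode, reward_fn, k=4, rng=random):
    """Hindsight-style relabeling: for each transition, also store copies
    whose goal is a state actually reached later in the episode, with the
    reward recomputed under the substituted goal.

    episode: list of dicts with keys 'obs', 'action', 'next_obs', 'goal'.
    reward_fn(achieved, goal) -> float.
    """
    relabeled = []
    for t, tr in enumerate(episode):
        # Keep the original transition (reward under the intended goal).
        relabeled.append(dict(tr, reward=reward_fn(tr["next_obs"], tr["goal"])))
        # Sample up to k "future" achieved states as substitute goals.
        future = episode[t:]
        for _ in range(min(k, len(future))):
            new_goal = rng.choice(future)["next_obs"]
            relabeled.append(dict(tr, goal=new_goal,
                                  reward=reward_fn(tr["next_obs"], new_goal)))
    return relabeled
```

Because every episode "succeeds" at whatever goals it happened to reach, the replay buffer is dense in positive rewards even when the original goal was never achieved.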

CLAMGen: Closed-Loop Arm Motion Generation via Multi-view Vision-Based RL [article]

Iretiayo Akinola, Zizhao Wang, Peter Allen
2021 arXiv   pre-print
Furthermore, we introduce novel learning objectives and techniques to improve 3D understanding from multiple image views and the sample efficiency of our algorithm.  ...  values and residual actions learned from images to avoid obstacles.  ...  Our results show that residual Q-learning significantly improves sample efficiency when learning from images, as the base policy provides improved exploration.  ... 
arXiv:2103.13267v1 fatcat:zsacjshkzvanbk64sqsiwpn5lu

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels [article]

Ilya Kostrikov, Denis Yarats, Rob Fergus
2021 arXiv   pre-print
We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary  ...  Existing model-free approaches, such as Soft Actor-Critic (SAC), are not able to train deep networks effectively from image pixels.  ...  Finally, we would like to thank Ankesh Anand for helping us find an error in our evaluation script for the Atari 100k benchmark experiments.  ... 
arXiv:2004.13649v4 fatcat:6dl4xjzzfzebbctpsdq2ktbl2a
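The augmentation this line of work relies on is essentially a random shift: pad the image by a few pixels, then crop back to the original size at a random offset. The NumPy sketch below is a hypothetical rendering in that spirit (the NHWC layout, `pad=4` default, and replicate padding are assumptions, not the paper's exact implementation).

```python
import numpy as np

def random_shift(imgs, pad=4, rng=None):
    """Random-shift augmentation: replicate-pad each image by `pad`
    pixels on every side, then take a random crop back to the original
    size.  imgs: (batch, height, width, channels) array.
    """
    rng = np.random.default_rng() if rng is None else rng
    b, h, w, c = imgs.shape
    padded = np.pad(imgs, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                    mode="edge")
    out = np.empty_like(imgs)
    for i in range(b):
        top = rng.integers(0, 2 * pad + 1)   # vertical offset of the crop
        left = rng.integers(0, 2 * pad + 1)  # horizontal offset
        out[i] = padded[i, top:top + h, left:left + w]
    return out
```

Applying independent shifts to the observations used in the Q-target and Q-estimate acts as a regularizer, which is what lets SAC-style agents train stably from pixels without auxiliary losses.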
Showing results 1 — 15 out of 69,982 results