Conservative Data Sharing for Multi-Task Offline Reinforcement Learning
[article]
2021
arXiv
pre-print
To address this challenge, we develop a simple technique for data-sharing in multi-task offline RL that routes data based on the improvement over the task-specific data. ...
However, sharing data across all tasks in multi-task offline RL performs surprisingly poorly in practice. ...
CF is a CIFAR Fellow in the Learning in Machines and Brains program. ...
arXiv:2109.08128v1
fatcat:mdtzyivqmvdepkv6s3v6ycwpii
How to Leverage Unlabeled Data in Offline Reinforcement Learning
[article]
2022
arXiv
pre-print
Offline reinforcement learning (RL) can learn control policies from static datasets but, like standard RL methods, it requires reward annotations for every transition. ...
How can we best leverage such unlabeled data in offline RL? One natural solution is to learn a reward function from the labeled data and use it to label the unlabeled data. ...
Introduction Offline reinforcement learning (RL) provides the promise of a fully data-driven framework for learning performant policies. ...
arXiv:2202.01741v2
fatcat:qdt63dnmtneivdlvzhni64l7lq
Offline Distillation for Robot Lifelong Learning with Imbalanced Experience
[article]
2022
arXiv
pre-print
We show that the Offline Distillation Pipeline achieves better performance across all the encountered environments without affecting data collection. ...
We investigate two challenges in such a lifelong learning setting: first, existing off-policy algorithms struggle with the trade-off between being conservative to maintain good performance in the old environment ...
While most of the work in offline RL focuses on one task, Yu et al. (2021) study multi-task offline RL with the goal of improving single-task performance by selectively sharing the data across tasks ...
arXiv:2204.05893v1
fatcat:6jlcaw3ddfhfdblymlnghytaqi
TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations
[article]
2021
arXiv
pre-print
We then developed a distributed learning system and new offline algorithms to learn a powerful multi-agent AI from the fixed single-agent dataset. ...
Deep reinforcement learning (DRL) has achieved super-human performance on complex video games (e.g., StarCraft II and Dota II). ...
Baselines: CQL: CQL is an offline reinforcement learning algorithm that tries to learn conservative Q-values via adding penalties on the Q-functions. ...
arXiv:2110.04507v5
fatcat:h2u4yhlpvjg6jfikjtefzz4v5y
What Matters in Learning from Offline Human Demonstrations for Robot Manipulation
[article]
2021
arXiv
pre-print
We also highlight opportunities for learning from human datasets, such as the ability to learn proficient policies on challenging, multi-stage tasks beyond the scope of current reinforcement learning methods ...
Our study analyzes the most critical challenges when learning from offline human data for manipulation. ...
Acknowledgments We would like to thank Albert Tung for helping with the RoboTurk data collection system, Jim Fan for providing timely lab cluster support, and Helen Roman for helping order items for the ...
arXiv:2108.03298v2
fatcat:p7fmn5qjynftbjaymffoifpf5y
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning
[article]
2021
arXiv
pre-print
Learning from datasets without interaction with environments (Offline Learning) is an essential step to apply Reinforcement Learning (RL) algorithms in real-world scenarios. ...
the dataset for value estimation. ...
Conclusion In this work, we demonstrate a critical problem in multi-agent off-policy reinforcement learning with finite data, where it introduces accumulated extrapolation error in the number of agents ...
arXiv:2106.03400v2
fatcat:fxxuhcneafe7beuiypglcu6o5e
Model-Based Offline Meta-Reinforcement Learning with Regularization
[article]
2022
arXiv
pre-print
Existing offline reinforcement learning (RL) methods face a few major challenges, particularly the distributional shift between the learned policy and the behavior policy. ...
Motivated by such empirical analysis, we explore model-based offline Meta-RL with regularized Policy Optimization (MerPO), which learns a meta-model for efficient task structure inference and an informative ...
Conservative data sharing for multi-task offline reinforcement learning. arXiv preprint arXiv:2109.08128, 2021a. ...
arXiv:2202.02929v1
fatcat:mfj35r2zazbnzpklwkvy2dw44u
Offline Reinforcement Learning with Reverse Model-based Imagination
[article]
2021
arXiv
pre-print
These reverse imaginations provide informed data augmentation for model-free policy learning and enable conservative generalization beyond the offline dataset. ...
In offline reinforcement learning (offline RL), one of the main challenges is to deal with the distributional shift between the learning policy and the given dataset. ...
Acknowledgments and Disclosure of Funding The authors would like to thank the anonymous reviewers, Zhizhou Ren, Kun Xu, and Hang Su for valuable and insightful discussions and helpful suggestions. ...
arXiv:2110.00188v2
fatcat:n5k3r3kwknglxg6w2lz3tcyhae
Offline Decentralized Multi-Agent Reinforcement Learning
[article]
2021
arXiv
pre-print
In many real-world multi-agent cooperative tasks, due to high cost and risk, agents cannot interact with the environment and collect experiences during learning, but have to learn from offline datasets ...
Mathematically, we prove the convergence of Q-learning under the non-stationary transition probabilities after modification. ...
Conclusion and Discussion In this paper, we proposed MABCQ for offline and fully decentralized multi-agent reinforcement learning. ...
arXiv:2108.01832v1
fatcat:gtyeobjxgjgrldrfnoy7f4kt7u
Multi-Task Conditional Imitation Learning for Autonomous Navigation at Crowded Intersections
[article]
2022
arXiv
pre-print
A multi-task conditional imitation learning framework is proposed to adapt both lateral and longitudinal control tasks for safe and efficient interaction. ...
In recent years, great efforts have been devoted to deep imitation learning for autonomous driving control, where raw sensory inputs are directly mapped to control actions. ...
Multi-task Learning in Computer Vision Multi-task learning [28], [29] aims to improve learning efficiency by learning multiple complementary tasks from shared representations. ...
arXiv:2202.10124v1
fatcat:xzhprx6hwzh67ekv5eokx7ddqu
Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL
[article]
2021
arXiv
pre-print
Offline Reinforcement Learning (RL) aims to extract near-optimal policies from imperfect offline data without additional environment interactions. ...
We investigate how to improve the performance of offline RL algorithms, their robustness to the quality of offline data, as well as their generalization capabilities. ...
The authors thank Kevin Lu and Justin Fu for help with setting up the D4RL benchmark tasks. ...
arXiv:2106.09119v2
fatcat:clvqausk3fcnfdp7i6rpgzapde
Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic
[article]
2021
arXiv
pre-print
Model-based reinforcement learning algorithms, which aim to learn a model of the environment to make decisions, are more sample efficient than their model-free counterparts. ...
To tackle this problem, we propose the conservative model-based actor-critic (CMBAC), a novel approach that achieves high sample efficiency without the strong reliance on accurate learned models. ...
Acknowledgements We would like to thank all the anonymous reviewers for their insightful comments. ...
arXiv:2112.10504v1
fatcat:o6gzj3gbqjdqbnoxt7byfsdzp4
Reinforcement Learning in Practice: Opportunities and Challenges
[article]
2022
arXiv
pre-print
In this article, we first give a brief introduction to reinforcement learning (RL), and its relationship with deep learning, machine learning and AI. ...
Then we discuss challenges, in particular, 1) foundation, 2) representation, 3) reward, 4) exploration, 5) model, simulation, planning, and benchmarks, 6) off-policy/offline learning, 7) learning to learn ...
Examples in this category include methods for transfer learning, multi-task learning, and few-shot learning. ...
arXiv:2202.11296v2
fatcat:xdtsmme22rfpfn6rgfotcspnhy
A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems
[article]
2022
arXiv
pre-print
With the widespread adoption of deep learning, reinforcement learning (RL) has experienced a dramatic increase in popularity, scaling to previously intractable problems, such as playing complex games from ...
Effective offline RL algorithms have a much wider range of applications than online RL, being particularly appealing for real-world applications such as education, healthcare, and robotics. ...
Index Terms: Deep learning, reinforcement learning, offline reinforcement learning, batch reinforcement learning I. ...
arXiv:2203.01387v2
fatcat:euobvze7kre3fi7blalnbbgefm
Offline Reinforcement Learning from Images with Latent Space Models
[article]
2020
arXiv
pre-print
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions. ...
However, the ability to learn directly from rich observation spaces like images is critical for real-world applications such as robotics. ...
Acknowledgments We want to thank Suraj Nair for sharing the BEE dataset with us and his help with setting up the Panda drawer environment. ...
arXiv:2012.11547v1
fatcat:krdth53gm5d2ddnmrbygr3atgu