A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf
.
RAMBO-RL: Robust Adversarial Model-Based Offline Reinforcement Learning
[article]
2022
arXiv
pre-print
Offline reinforcement learning (RL) aims to find performant policies from logged data without further environment interaction. Model-based algorithms, which learn a model of the environment from the dataset and perform conservative policy optimisation within that model, have emerged as a promising approach to this problem. In this work, we present Robust Adversarial Model-Based Offline RL (RAMBO), a novel approach to model-based offline RL. We formulate the problem as a two-player zero sum game
arXiv:2204.12581v3
fatcat:cpeo6dzeqndtze6iejy7xydvca