Wasserstein Robust Reinforcement Learning [article]

Mohammed Amin Abdullah and Hang Ren and Haitham Bou Ammar and Vladimir Milenkovic and Rui Luo and Mingtian Zhang and Jun Wang
2019 arXiv pre-print
Reinforcement learning algorithms, though successful, tend to over-fit to training environments, hampering their application to the real world. This paper proposes WR^2L, a robust reinforcement learning algorithm with significant robust performance on low- and high-dimensional control tasks. Our method formalises robust reinforcement learning as a novel min-max game with a Wasserstein constraint for a correct and convergent solver. Apart from the formulation, we also propose an efficient and ... ble solver following a novel zero-order optimisation method that we believe can be useful to numerical optimisation in general. We empirically demonstrate significant gains compared to standard and robust state-of-the-art algorithms on high-dimensional MuJoCo environments.
arXiv:1907.13196v4 fatcat:qeldcwoy6jbh7jnauw62acmxqu