A copy of this work has been preserved in the Wayback Machine (captured 2022; original URL available). File type: application/pdf.
Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning
arXiv pre-print, 2022
Despite the recent success of reinforcement learning in various domains, these approaches remain, for the most part, deterringly sensitive to hyper-parameters and often hinge on essential engineering feats for their success. We consider the case of off-policy generative adversarial imitation learning and perform an in-depth qualitative and quantitative review of the method. We show that forcing the learned reward function to be local Lipschitz-continuous is a sine qua non condition for the method to perform well.
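The Lipschitzness constraint on the learned reward can be illustrated with a gradient-penalty regularizer in the style of WGAN-GP; the following NumPy sketch is a hypothetical illustration (the function `reward`, the target constant `k`, and the penalty form are assumptions for exposition, not the paper's exact implementation). It penalizes input-gradient norms of a toy reward that exceed the target Lipschitz constant.

```python
import numpy as np

# Hypothetical toy reward r(x) = tanh(w . x); its local Lipschitz constant at x
# is the norm of its input gradient, which this sketch regularizes toward k.

def reward(w, x):
    return np.tanh(w @ x)

def reward_grad(w, x):
    # d/dx tanh(w . x) = (1 - tanh(w . x)^2) * w  (closed form, no autograd needed)
    return (1.0 - np.tanh(w @ x) ** 2) * w

def gradient_penalty(w, xs, k=1.0):
    # Mean squared excess of the gradient norm over the target constant k,
    # evaluated at sampled states xs (a one-sided gradient penalty).
    norms = np.array([np.linalg.norm(reward_grad(w, x)) for x in xs])
    return float(np.mean(np.maximum(norms - k, 0.0) ** 2))

rng = np.random.default_rng(0)
w = np.array([3.0, -4.0])        # ||w|| = 5, so the raw reward can be up to 5-Lipschitz
xs = rng.normal(size=(64, 2))    # sampled states at which the constraint is checked
print(gradient_penalty(w, xs))   # positive: the k = 1 constraint is violated somewhere
```

In training, such a penalty term would be added to the discriminator (reward) loss so that gradient steps push the reward toward local 1-Lipschitzness around the sampled states.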
arXiv:2006.16785v3