Learning from a Learner
International Conference on Machine Learning
In this paper, we propose a novel setting for Inverse Reinforcement Learning (IRL), namely "Learning from a Learner" (LfL). Unlike standard IRL, which learns a reward by observing an optimal agent, LfL learns the reward from observations of another agent that is itself still learning (and is thus suboptimal). To do so, we leverage the fact that the observed agent's policy is assumed to improve over time. The ultimate goal of this approach is to recover the actual environment's reward and to allow the observer
dblp:conf/icml/JacqGPP19 fatcat:t6erqmca25gy3mz6rm2lv4y2du