A copy of this work was available on the public web and has been preserved in the Wayback Machine; the capture dates from 2021. The file type is application/pdf.
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
[article] · 2021 · arXiv pre-print
Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the …
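The connection the abstract describes can be illustrated in the tabular case. The following is a minimal sketch (not the paper's algorithm): the successor representation of a Markov chain is $(I - \gamma P)^{-1}$, the normalized discounted state occupancy follows from it, and the MIS ratio $w(s) = d_\pi(s)/d_\mu(s)$ reweights behavior-policy occupancies to evaluate the target policy. All states, transition matrices, and rewards below are invented for illustration.

```python
import numpy as np

# Toy 3-state chain; gamma, transitions, and rewards are illustrative only.
gamma = 0.9
mu0 = np.array([1.0, 0.0, 0.0])          # initial state distribution

# State-transition matrices induced by a target policy (pi) and a
# behavior/sampling policy (mu).
P_pi = np.array([[0.1, 0.9, 0.0],
                 [0.0, 0.1, 0.9],
                 [0.9, 0.0, 0.1]])
P_mu = np.array([[0.5, 0.5, 0.0],
                 [0.0, 0.5, 0.5],
                 [0.5, 0.0, 0.5]])

def occupancy(P):
    # Successor representation: SR = (I - gamma * P)^{-1};
    # the normalized discounted occupancy is (1 - gamma) * mu0 @ SR.
    sr = np.linalg.inv(np.eye(3) - gamma * P)
    return (1.0 - gamma) * mu0 @ sr

d_pi, d_mu = occupancy(P_pi), occupancy(P_mu)
w = d_pi / d_mu                          # marginalized importance ratio

r = np.array([0.0, 1.0, 2.0])            # per-state reward
on_policy = d_pi @ r                     # direct evaluation under pi
mis = d_mu @ (w * r)                     # reweighted evaluation under mu
print(np.allclose(on_policy, mis))       # the two estimates coincide
```

Because the reweighting is exact in the tabular case, the MIS estimate matches on-policy evaluation; the paper's contribution concerns estimating this ratio with deep networks rather than matrix inversion.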
arXiv:2106.06854v1
fatcat:4z7eaqfh6rfkjao3z2qip34n4i