Inferring Probabilistic Reward Machines from Non-Markovian Reward Processes for Reinforcement Learning
arXiv pre-print, 2022
The success of reinforcement learning in typical settings is predicated on Markovian assumptions on the reward signal by which an agent learns optimal policies. In recent years, the use of reward machines has relaxed this assumption by enabling a structured representation of non-Markovian rewards. In particular, such representations can be used to augment the state space of the underlying decision process, thereby facilitating non-Markovian reinforcement learning. However, these reward machines […]
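To make the idea concrete, here is a minimal sketch (a hypothetical toy example, not the paper's implementation) of a reward machine: a finite-state machine whose transitions are driven by high-level event labels and which emits rewards. Tracking its state alongside the environment state makes an otherwise non-Markovian reward Markovian over the product state. All names (`RewardMachine`, the states `u0`–`u2`, labels `A`/`B`) are illustrative assumptions.

```python
class RewardMachine:
    """Toy reward machine over propositional labels (illustrative sketch)."""

    def __init__(self, transitions, rewards, initial_state):
        # transitions: dict mapping (rm_state, label) -> next rm_state
        # rewards: dict mapping (rm_state, label) -> reward emitted on that step
        self.transitions = transitions
        self.rewards = rewards
        self.state = initial_state

    def step(self, label):
        # Advance on an observed label; unmatched (state, label) pairs
        # self-loop with zero reward.
        reward = self.rewards.get((self.state, label), 0.0)
        self.state = self.transitions.get((self.state, label), self.state)
        return reward


# Toy task: "visit A, then B" -- reward arrives only when B follows A,
# which is non-Markovian in the raw environment observation alone.
rm = RewardMachine(
    transitions={("u0", "A"): "u1", ("u1", "B"): "u2"},
    rewards={("u1", "B"): 1.0},
    initial_state="u0",
)

total = sum(rm.step(lbl) for lbl in ["B", "A", "B"])
print(total)     # seeing B before A yields nothing; B after A yields 1.0
print(rm.state)  # machine has reached its accepting state "u2"
```

An RL agent would then learn over the augmented state `(s, u)`, where `s` is the environment state and `u` the machine state, restoring the Markov property the standard algorithms rely on.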
arXiv:2107.04633v2