The Internet Archive has a preservation copy of this work in our general collections.
The file type is application/pdf.
Decision Making Agent Searching for Markov Models in Near-Deterministic World
[article]
2011
arXiv
pre-print
Reinforcement learning has solid foundations, but becomes inefficient in partially observed (non-Markovian) environments. Thus, a learning agent, born with a representation and a policy, might wish to investigate to what extent the Markov property holds. We propose a learning architecture that utilizes combinatorial policy optimization to overcome non-Markovity and to develop efficient behaviors, which are easy to inherit, tests the Markov property of the behavioral states, and corrects against
arXiv:1102.5561v2
fatcat:nmbhk2fqcfa4pbjdlvmeebwh5y