Learning Optimal Policies in Markov Decision Processes with Value Function Discovery
2015
Performance Evaluation Review
In this paper we describe recent progress in our work on Value Function Discovery (VFD), a novel method for discovery of value functions for Markov Decision Processes (MDPs). In a previous paper we described how VFD discovers algebraic descriptions of value functions (and the corresponding policies) using ideas from the Evolutionary Algorithm field. A special feature of VFD is that the descriptions include the model parameters of the MDP. We extend that work and show how additional information
doi:10.1145/2825236.2825239
fatcat:frnnhlofcnhzdcdtejcnqh2fsq