The file type is application/pdf.
First Passage Optimality and Variance Minimisation of Markov Decision Processes with Varying Discount Factors
2015
Journal of Applied Probability
This paper deals with the first passage optimality and variance minimisation problems of discrete-time Markov decision processes (MDPs) with varying discount factors and unbounded rewards/costs. First, under suitable conditions slightly weaker than those in the previous literature on the standard (infinite horizon) discounted MDPs, we establish the existence and characterisation of the first passage expected-optimal stationary policies. Second, to further distinguish the expected-optimal
doi:10.1239/jap/1437658608
fatcat:743sk7rysrerld3362bozuwphi