The simplex method is strongly polynomial for deterministic Markov decision processes
[chapter]
2013
Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms
We prove that the simplex method with the highest gain/most-negative-reduced cost pivoting rule converges in strongly polynomial time for deterministic Markov decision processes (MDPs) regardless of the discount factor. For a deterministic MDP with n states and m actions, we prove the simplex method runs in O(n^3 m^2 log^2 n) iterations if the discount factor is uniform and O(n^5 m^3 log^2 n) iterations if each action has a distinct discount factor. Previously the simplex method was known to
doi:10.1137/1.9781611973105.105
dblp:conf/soda/PostY13
fatcat:7zn2v5my5jhvrl3ufnekvsuauy