A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is `application/pdf`.


### The simplex method is strongly polynomial for deterministic Markov decision processes
[article]

2013 · arXiv · pre-print

arXiv:1208.5083v2
fatcat:qrvqltlxxbdg7jl63vw2sbkt6q

We prove that the simplex method with the highest-gain/most-negative-reduced-cost pivoting rule converges in strongly polynomial time for deterministic Markov decision processes (MDPs) regardless of the ... For a deterministic MDP with n states and m actions, we prove the simplex method runs in O(n^3 m^2 log^2 n) iterations if the discount factor is uniform and O(n^5 m^3 log^2 n) iterations if each action has ... Introduction: Markov decision processes (MDPs) are a powerful tool for modeling repeated decision making in stochastic, dynamic environments. ...
### The simplex method is strongly polynomial for deterministic Markov decision processes
[chapter]

2013 · Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms

doi:10.1137/1.9781611973105.105
dblp:conf/soda/PostY13
fatcat:7zn2v5my5jhvrl3ufnekvsuauy

We prove that the simplex method with the highest-gain/most-negative-reduced-cost pivoting rule converges in strongly polynomial time for deterministic Markov decision processes (MDPs) regardless of the ... For a deterministic MDP with n states and m actions, we prove the simplex method runs in O(n^3 m^2 log^2 n) iterations if the discount factor is uniform and O(n^5 m^3 log^2 n) iterations if each action ... Introduction: Markov decision processes (MDPs) are a powerful tool for modeling repeated decision making in stochastic, dynamic environments. ...
### The Simplex Method is Strongly Polynomial for Deterministic Markov Decision Processes

2015 · Mathematics of Operations Research

doi:10.1287/moor.2014.0699
fatcat:oerirap3nfbwrorlark5b3hcl4

We prove that the simplex method with the highest-gain/most-negative-reduced-cost pivoting rule converges in strongly polynomial time for deterministic Markov decision processes (MDPs) regardless of the ... For a deterministic MDP with n states and m actions, we prove the simplex method runs in O(n^3 m^2 log^2 n) iterations if the discount factor is uniform and O(n^5 m^3 log^2 n) iterations if each action ... The authors would like to thank Kazuhisa Makino for pointing out an error in Lemma 3.3. ...
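The pivoting rule analyzed in the three versions above can be made concrete with a small sketch. The Python below is our own illustration (the toy MDP and all names are assumptions, not code from the paper): for a deterministic discounted MDP it repeatedly switches the single state-action pair with the largest positive gain, which is the simplex method on the MDP's linear program under the most-negative-reduced-cost rule.

```python
import numpy as np

def evaluate(policy, next_state, reward, gamma):
    """Policy evaluation for a deterministic MDP: solve (I - gamma*P) V = r,
    where P is the 0/1 transition matrix induced by the policy."""
    n = len(policy)
    P, r = np.zeros((n, n)), np.zeros(n)
    for s, a in enumerate(policy):
        P[s, next_state[s][a]] = 1.0
        r[s] = reward[s][a]
    return np.linalg.solve(np.eye(n) - gamma * P, r)

def simplex_policy_iteration(next_state, reward, gamma):
    """Simplex method with the highest-gain (most-negative-reduced-cost)
    pivot: each iteration switches the single state-action pair whose gain
    reward(s,a) + gamma*V[next(s,a)] - V[s] is largest and positive."""
    policy = [0] * len(next_state)
    while True:
        V = evaluate(policy, next_state, reward, gamma)
        best_gain, best = 1e-9, None
        for s, actions in enumerate(next_state):
            for a in range(len(actions)):
                gain = reward[s][a] + gamma * V[actions[a]] - V[s]
                if gain > best_gain:
                    best_gain, best = gain, (s, a)
        if best is None:
            return policy, V  # no improving pivot: the policy is optimal
        policy[best[0]] = best[1]

# Toy 3-state instance: state 2 is an absorbing loop with reward 2,
# state 0 can enter it directly (reward 1) or cycle with state 1 (reward 0).
next_state = [[1, 2], [0], [2]]
reward = [[0.0, 1.0], [0.0], [2.0]]
```

On this instance with gamma = 0.9, a single pivot at state 0 reaches the optimal policy; the bound quoted in the abstract concerns how many such pivots are needed in the worst case.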
### On the Complexity of Solving Markov Decision Problems
[article]

2013 · arXiv · pre-print

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. ... To encourage future research, we sketch some alternative methods of analysis that rely on the structure of MDPs. ... Acknowledgments: Thanks to Justin Boyan, Tony Cassandra, Anne Condon, Paul Dagum, Michael Jordan, Philip Klein, Hsueh-I Lu, Walter Ludwig, Satinder Singh, John Tsitsiklis, and Marty Puterman for pointers ...
### The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

2011 · Mathematics of Operations Research

doi:10.1287/moor.1110.0516
fatcat:elehu5k54jewvcy3fozo6xzrgu

We prove that the classic policy-iteration method (Howard 1960), including the Simplex method (Dantzig 1947) with the most-negative-reduced-cost pivoting rule, is a strongly polynomial-time algorithm ... Furthermore, the computational complexity of the policy-iteration method (including the Simplex method) is superior to that of the only known strongly polynomial-time interior-point algorithm ([28] 2005 ... I thank Pete Veinott and four anonymous Referees for many insightful discussions and suggestions on this subject, which have greatly improved the presentation of the paper. ...
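Ye's result concerns Howard's classic policy-iteration method. As a hedged sketch (our own minimal implementation and toy MDP, not the paper's), Howard's method for a general finite discounted MDP evaluates the current policy exactly and then improves it greedily at every state:

```python
import numpy as np

def howard_policy_iteration(P, R, gamma):
    """Howard (1960) policy iteration for a finite discounted MDP.
    P[a] is the n x n transition matrix of action a, R[a] the n-vector
    of expected rewards; assumes 0 < gamma < 1."""
    n = P[0].shape[0]
    policy = np.zeros(n, dtype=int)
    while True:
        # Policy evaluation: V = (I - gamma * P_pi)^{-1} r_pi
        P_pi = np.array([P[policy[s]][s] for s in range(n)])
        r_pi = np.array([R[policy[s]][s] for s in range(n)])
        V = np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)
        # Policy improvement: greedy in the one-step lookahead values
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

# Toy 2-state, 2-action MDP: action 0 stays put, action 1 swaps states.
P = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
R = [np.array([0.0, 1.0]), np.array([0.5, 0.0])]
```

The difference from the simplex view is that Howard's method may switch the action at many states per iteration, whereas the simplex method switches exactly one; Ye's paper bounds both variants.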
### Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming

2014 · Operations Research Letters

doi:10.1016/j.orl.2014.07.006
fatcat:vt3ycq33vnddlmuuzcvc7cpifq

Therefore any such algorithm is not strongly polynomial. In particular, the modified policy iteration and λ-policy iteration algorithms are not strongly polynomial. ... This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem ... Acknowledgment: The research of the first two authors was partially supported by NSF Grant CMMI-1335296. ...
### Solving infinite-horizon POMDPs with memoryless stochastic policies in state-action space
[article]

2022 · arXiv · pre-print

arXiv:2205.14098v1
fatcat:ovo7lx6hrjeufioqlilkphz4te

Reward optimization in fully observable Markov decision processes is equivalent to a linear program over the polytope of state-action frequencies. ... Taking a similar perspective in the case of partially observable Markov decision processes with memoryless stochastic policies, the problem was recently formulated as the optimization of a linear objective ... JM also acknowledges support from the International Max Planck Research School for Mathematics in the Sciences and the Evangelisches Studienwerk Villigst e.V. ...
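The equivalence stated in the first sentence of this abstract, reward optimization as a linear program over the polytope of state-action frequencies, can be checked numerically. The sketch below is our own illustration (the toy MDP and names are assumptions, not from the paper): it computes the discounted state-action frequencies of a policy, then verifies the flow constraints of the polytope and that the linear objective equals the (1-gamma)-normalized expected discounted reward.

```python
import numpy as np

def state_action_frequencies(P, policy, nu, gamma):
    """Discounted state-action frequencies mu of a deterministic policy:
    state occupancies d = (1-gamma) * nu^T (I - gamma*P_pi)^{-1}, with the
    mass at state s placed on the chosen action policy[s]."""
    n = P[0].shape[0]
    P_pi = np.array([P[policy[s]][s] for s in range(n)])
    d = np.linalg.solve((np.eye(n) - gamma * P_pi).T, (1 - gamma) * nu)
    mu = np.zeros((len(P), n))
    for s in range(n):
        mu[policy[s], s] = d[s]
    return mu

# Toy 2-state, 2-action MDP: action 0 stays put, action 1 swaps states.
P = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
R = [np.array([0.0, 1.0]), np.array([0.5, 0.0])]
nu, gamma = np.array([0.5, 0.5]), 0.8
mu = state_action_frequencies(P, [1, 0], nu, gamma)

# Flow constraint of the frequency polytope:
# sum_a mu[a,s'] - gamma * sum_{s,a} P[a][s,s'] * mu[a,s] = (1-gamma) * nu[s']
flow = mu.sum(axis=0) - gamma * sum(P[a].T @ mu[a] for a in range(2))
assert np.allclose(flow, (1 - gamma) * nu)

# The LP objective over this polytope, r . mu
objective = sum(R[a] @ mu[a] for a in range(2))
```

Maximizing this linear objective over all mu satisfying the flow constraints recovers an optimal policy, which is the LP formulation the abstract refers to.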
### Page 2222 of Mathematical Reviews Vol. , Issue 99c
[page]

1991 · Mathematical Reviews

Shao Hui] (PRC-HKSTB; Kowloon) Markov decision programming for process control in batch production. ... In this paper a polynomial time primal network simplex algorithm for the minimum cost flow problem is developed. ...
### Comments on: Recent progress on the combinatorial diameter of polytopes and simplicial complexes

2013 · TOP - An Official Journal of the Spanish Society of Statistics and Operations Research

doi:10.1007/s11750-013-0291-y
fatcat:klst3zgaeva6rmik6ddbdtvf4a

I am also grateful to the Technische Universität München for the hospitality received during the time of writing this article. ... I also want to thank the editors of this volume for the invitation to contribute a commentary to this special issue. ... programs derived from Markov Decision Processes with Fixed Discount (which is not the setting for the other papers, but is an important case of MDPs). ...
### Page 5394 of Mathematical Reviews Vol. , Issue 95i
[page]

1995 · Mathematical Reviews

The basic idea behind the numerical approximation methods is to build a discrete Markov decision process with finite state space and finite control space which is readily solvable and approximates the ... Summary: "This paper deals with numerical methods for the optimization of piecewise, stationary and deterministic systems. ...
### Strong Polynomiality of the Value Iteration Algorithm for Computing Nearly Optimal Policies for Discounted Dynamic Programming
[article]

2020 · arXiv · pre-print

arXiv:2001.10174v1
fatcat:gss3vsncgvd3tcx2nu4xj2hjqa

This note provides upper bounds on the number of operations required to compute by value iterations a nearly optimal policy for an infinite-horizon discounted Markov decision process with a finite number ... For a given discount factor, magnitude of the reward function, and desired closeness to optimality, these upper bounds are strongly polynomial in the number of state-action pairs, and one of the provided ... Introduction: Value and policy iteration algorithms are the major tools for solving infinite-horizon discounted Markov decision processes (MDPs). ...
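As a hedged sketch of the algorithm this note analyzes (our own minimal implementation and toy MDP, not the paper's code), value iteration with a standard stopping rule that guarantees the greedy policy is within eps of optimal:

```python
import numpy as np

def value_iteration(P, R, gamma, eps):
    """Value iteration for a finite discounted MDP (0 < gamma < 1).
    Stops when the Bellman update moves less than eps*(1-gamma)/(2*gamma)
    in the sup norm, a standard criterion ensuring the greedy policy's
    value is within eps of optimal."""
    n = P[0].shape[0]
    V = np.zeros(n)
    threshold = eps * (1 - gamma) / (2 * gamma)
    while True:
        # One Bellman backup: Q[a] = R[a] + gamma * P[a] V, V_new = max_a Q[a]
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < threshold:
            return Q.argmax(axis=0), V_new
        V = V_new

# Toy 2-state, 2-action MDP: action 0 stays put, action 1 swaps states.
P = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
R = [np.array([0.0, 1.0]), np.array([0.5, 0.0])]
```

The note's bounds count exactly such Bellman backups: for fixed discount factor, reward magnitude, and eps, the number of iterations is strongly polynomial in the number of state-action pairs.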
### The complexity of Policy Iteration is exponential for discounted Markov Decision Processes

2012 · 2012 IEEE 51st IEEE Conference on Decision and Control (CDC)

The question of knowing whether the Policy Iteration algorithm (PI) for solving stationary Markov Decision Processes (MDPs) has exponential or (strongly) polynomial complexity has attracted much attention ... On the other hand, it was shown that PI runs in strongly polynomial time on discounted-reward MDPs, yet only when the discount factor is fixed beforehand. ... Markov Decision Processes can be solved in weakly polynomial time using Linear Programming (LP) [16]. ...
### A Linear Programming Approach to Nonstationary Infinite-Horizon Markov Decision Processes

2013 · Operations Research

doi:10.1287/opre.1120.1121
fatcat:3blutyfygfetfhud2i63zju2ga

A new result by Ye [51] shows that Dantzig's original Simplex method with the most negative reduced cost pivoting rule [15] is strongly polynomial for solving stationary MDPs. ... This complexity bound is better than the polynomial performance of value iteration [49, 51], and in fact, is superior to the only known strongly polynomial time interior point algorithm [50] for solving ... Similar to Dantzig's strongly polynomial time Simplex method with the most negative reduced cost pivot rule for stationary MDPs, our infinite-dimensional Simplex method uses a most negative approximate ...
### The existence of a strongly polynomial time simplex algorithm for linear programming problems
[article]

2022 · arXiv · pre-print

arXiv:2006.11466v13
fatcat:xb5gazihwvex5b7f5kkblcsqbm

It is well known that whether there is a polynomial time simplex algorithm for linear programming (LP) is the most challenging open problem in optimization and discrete geometry. ... We show that there is a simplex algorithm whose number of pivoting steps does not exceed the number of variables of an LP problem. ... There has been recent interest in finding an algorithm like this for the deterministic Markov decision processes [56], the generalized circulation problem [28], the maximum flow problem [2, 29], ...
### Dantzig's pivoting rule for shortest paths, deterministic MDPs, and minimum cost to time ratio cycles
[chapter]

2013 · Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms

doi:10.1137/1.9781611973402.63
dblp:conf/soda/HansenKZ14
fatcat:7dbnw6vmtfegtfs5ggqsf3ksay

Dantzig's pivoting rule is one of the most studied pivoting rules for the simplex algorithm. ... This gives a strongly polynomial time algorithm for the problem that does not use Megiddo's parametric search technique. ... Discounted, deterministic Markov decision processes: In this section we prove the following theorem which is essentially a generalization of Theorem 3.1. ...

Showing results 1 — 15 out of 716 results