Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor [article]

Thomas Dueholm Hansen, Peter Bro Miltersen, Uri Zwick
2010 arXiv   pre-print
iteration algorithm used for solving 2-player turn-based stochastic games with discounted zero-sum rewards.  ...  Ye showed recently that the simplex method with Dantzig pivoting rule, as well as Howard's policy iteration algorithm, solve discounted Markov decision processes (MDPs), with a constant discount factor  ...  Concluding remarks We have shown that the strategy iteration algorithm is strongly polynomial for 2TBSGs with a fixed discount factor.  ... 
arXiv:1008.0530v1 fatcat:ewpjqi5cjzeazmkjrj54g2yj7q

Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor

Thomas Dueholm Hansen, Peter Bro Miltersen, Uri Zwick
2013 Journal of the ACM  
iteration algorithm used for solving 2-player turn-based stochastic games with discounted zero-sum rewards.  ...  Ye showed recently that the simplex method with Dantzig pivoting rule, as well as Howard's policy iteration algorithm, solve discounted Markov decision processes (MDPs), with a constant discount factor  ...  Concluding remarks We have shown that the strategy iteration algorithm is strongly polynomial for 2TBSGs with a fixed discount factor.  ... 
doi:10.1145/2432622.2432623 fatcat:ntouvh6v4jd25cn32bugrefwy4

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity [article]

Aaron Sidford, Mengdi Wang, Lin F. Yang, Yinyu Ye
2019 arXiv   pre-print
Given a stochastic game with discount factor γ ∈ (0,1) we provide an algorithm that computes an ϵ-optimal strategy with high probability given Õ((1 - γ)^-3 ϵ^-2) samples from the transition function for each  ...  In this paper, we settle the sampling complexity of solving discounted two-player turn-based zero-sum stochastic games up to polylogarithmic factors.  ...  Computing an optimal strategy for a two-player turn-based zero-sum stochastic game is known to be in NP ∩ co-NP Condon (1992).  ... 
arXiv:1908.11071v1 fatcat:wfsvl4q3dnh7lawgniqvw6hlju

Magnifying Lens Abstraction for Stochastic Games with Discounted and Long-run Average Objectives [article]

Krishnendu Chatterjee and Luca de Alfaro and Pritam Roy
2011 arXiv   pre-print
We consider turn-based stochastic games with two classical quantitative objectives: discounted-sum and long-run average objectives.  ...  We present the MLA technique based abstraction-refinement algorithm for stochastic games and MDPs with discounted-sum objectives.  ...  Given a turn-based stochastic game graph G, with a reward function r : S → R≥0 and a discount factor 0 < β < 1, the following assertions hold. (Value iteration).  ... 
arXiv:1107.2132v1 fatcat:zem3oz6yifhpvpk6s6d3upbz74

Cyclic games and linear programming

Sergei Vorobyov
2008 Discrete Applied Mathematics  
The second one is based on the following polynomial time decidability of one-player MPGs. Proposition 2.3.  ...  If one of the players fixes his pure positional strategy, an optimal counterstrategy of the opponent is polynomial time computable.  ...  Acknowledgments Most ideas presented in this paper result from numerous discussions we had with Leonid since his visit to the Max-Planck Institut für Informatik, Saarbrücken, in summer 1998, and during  ... 
doi:10.1016/j.dam.2008.04.012 fatcat:5fmgypimu5cs3ahclgd4lsc4si

Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming

Eugene A. Feinberg, Jefferson Huang, Bruno Scherrer
2014 Operations Research Letters  
Therefore any such algorithm is not strongly polynomial. In particular, the modified policy iteration and λ-policy iteration algorithms are not strongly polynomial.  ...  This note shows that the number of arithmetic operations required by any member of a broad class of optimistic policy iteration algorithms to solve a deterministic discounted dynamic programming problem  ...  algorithm for two-player turn-based zero-sum stochastic games.  ... 
doi:10.1016/j.orl.2014.07.006 fatcat:vt3ycq33vnddlmuuzcvc7cpifq

Polyhedral value iteration for discounted games and energy games [article]

Alexander Kozachinskiy
2020 arXiv   pre-print
We present a deterministic algorithm, solving discounted games with n nodes in n^O(1) · (2 + √2)^n time. For bipartite discounted games our algorithm runs in n^O(1) · 2^n time.  ...  Our approach is heavily inspired by a recent algorithm of Dorfman et al. (ICALP 2019) for energy games. For completeness, we present their algorithm in terms of polyhedral value iteration.  ...  I am grateful to Pierre Ohlmann for giving a talk about [6] at the University of Warwick, and to Marcin Jurdzinski for discussions.  ... 
arXiv:2007.08575v2 fatcat:fdv66pzkbfhubdi7s34pajb3aq

A Strongly Polynomial Reduction for Linear Programs over Grids [article]

Lorenz Klaus
2015 arXiv   pre-print
We then consider two-player stochastic games with perfect information as a natural generalization of discounted MDPs.  ...  A strongly polynomial reduction from the games to their binary counterparts is obtained through a generalization of our reduction for Grid-LPs.  ...  This work is supported by the JST, ERATO Large Graph Project. We would like to thank Jan Foniok, Komei Fukuda, Naonori Kakimura, and Hanna Sumita for various contributions.  ... 
arXiv:1405.1827v7 fatcat:lb5uvz5mcbedlkqxiojsxrbh4e

Policy iteration algorithm for zero-sum multichain stochastic games with mean payoff and perfect information [article]

Marianne Akian and Jean Cochet-Terrasson and Sylvie Detournay and Stéphane Gaubert
2012 arXiv   pre-print
We develop here a policy iteration algorithm for zero-sum stochastic games with mean payoff, following an idea of two of the authors (Cochet-Terrasson and Gaubert, C. R. Math. Acad. Sci.  ...  We consider zero-sum stochastic games with finite state and action spaces, perfect information, mean payoff criteria, without any irreducibility assumption on the Markov chains associated to strategies  ...  The latter bound has been improved and generalized to zero-sum two player stochastic games with perfect information factor by Hansen, Miltersen and Zwick in [HMZ11], again for a fixed discount factor  ... 
arXiv:1208.0446v1 fatcat:jetodylrcbgfhcrhaym664qbly

Equilibria, fixed points, and complexity classes

Mihalis Yannakakis
2009 Computer Science Review  
...  Nash equilibria in normal form games with three (or more) players.  ...  It is not known whether these problems can be solved in polynomial time.  ...  As shown in [13], simple stochastic games (which are 'turn-based' games, i.e. only one player moves at a time) can be reduced to Shapley's game.  ... 
doi:10.1016/j.cosrev.2009.03.004 fatcat:lhr6iqyodfgond7vjrochf44la

Termination Criteria for Solving Concurrent Safety and Reachability Games [article]

Krishnendu Chatterjee and Luca de Alfaro and Thomas A. Henzinger
2008 arXiv   pre-print
We present in this paper a strategy improvement algorithm for computing the value of a concurrent safety game, that is, the maximal probability with which player 1 can enforce the safety objective.  ...  Since a value iteration algorithm, or a strategy improvement algorithm for reachability games, can be used to approximate the same value from above, the combination of both algorithms yields a method for  ...  Theorem 6 of [19] was proved for value functions of discounted games with costs, even when the discount factor λ = 0.  ... 
arXiv:0809.4017v1 fatcat:5oumfjsjyzfhhmsvmp56zv7emu

Equilibria, Fixed Points, and Complexity Classes [article]

Mihalis Yannakakis
2008 arXiv   pre-print
Nash equilibria in normal form games with 3 (or more) players.  ...  It is not known whether these problems can be solved in polynomial time.  ...  (Another standard equivalent formulation is as a discounted game, where the game does not stop but future rewards are discounted by a factor 1 − q per step).  ... 
arXiv:0802.2831v1 fatcat:nxsafbsw5jbo7khx47qwdmzotu

Algorithms for Parity Games [chapter]

Hartmut Klauck
2002 Lecture Notes in Computer Science  
games with finite arenas and other two-player games.  ...  If the highest priority of a vertex occurring infinitely often in the play is odd, then Player 1 wins, otherwise Player 0 wins. See Chapter 2 for more details. Exercise 7.1.  ...  The reduction from parity games to simple stochastic games that results increases the game arena only by a constant factor.  ... 
doi:10.1007/3-540-36387-4_7 fatcat:2xu24otqbzarfpfetg5574blza

Definable Zero-Sum Stochastic Games

Jérôme Bolte, Stéphane Gaubert, Guillaume Vigeral
2015 Mathematics of Operations Research  
As particular cases of our main results, we obtain that stochastic games with polynomial transitions, definable games with finite actions on one side, definable games with perfect information or switching  ...  We prove that the Shapley operator of any definable stochastic game with separable transition and reward functions is definable in the same structure.  ...  Venel for their very useful comments.  ... 
doi:10.1287/moor.2014.0666 fatcat:draa2ywu6zb77mtrmg56lz3rgi

A combinatorial strongly subexponential strategy improvement algorithm for mean payoff games

Henrik Björklund, Sergei Vorobyov
2007 Discrete Applied Mathematics  
It is based on iteratively improving the longest shortest distances to a sink in a possibly cyclic directed graph. We identify a new "controlled" version of the shortest paths problem.  ...  We suggest the first strongly subexponential and purely combinatorial algorithm for solving the mean payoff games problem.  ...  Acknowledgments We are grateful to Leonid Khachiyan, Vladimir Gurvich, and Endre Boros for inspiring discussions and illuminating ideas. We thank DIMACS for providing a creative working environment.  ... 
doi:10.1016/j.dam.2006.04.029 fatcat:6ecob37j55dz7mo4hn43fg6cse