Cyclic games and linear programming

Sergei Vorobyov
2008 Discrete Applied Mathematics  
New efficient algorithms for solving infinite-duration two-person adversary games with the decision problem in NP ∩ coNP, based on linear programming (LP), LP-representations, combinatorial LP, the linear complementarity problem (LCP), and controlled LP, are surveyed (Sections 4, 6, and 8); the new algorithms are often more efficient than previously known ones (e.g., subexponential; cf. Section 8). A systematic attempt at putting games into the linear programming framework appears fruitful and leads to
interesting controlled generalizations of standard combinatorial optimization problems, including longest-shortest paths (LSPs; Section 3) and controlled linear programming (Section 6). The latter provides a nice unifying formulation subsuming (being "hard" for) a class of games in NP ∩ coNP. Some well-known extensions of linear programming, like the linear complementarity problem (LCP) [16, 45, 17], also appear inspiring and productive for games (Sections 5 and 7), and as a by-product yield new efficient algorithms in P-matrix linear complementarity theory.

This paper is dedicated to the memory of Leonid. He was the brightest mind I ever met. For him everything was either clearly impossible (because of nonconvexity, etc.) or trivial (including his own 1979 result). He knew surprisingly many things and enlightened me, without any preparation, on numerous exotic topics (it was enough to mention something), repeatedly asserting (rather than asking) "you understand?", which left me, if not embarrassed, with lots of homework (which I still have to do). We first met in the summer of 1998 at the Max-Planck Institut für Informatik (Saarbrücken, Germany) and since then tried to find new approaches to mean payoff and parity games. In 2004 I ventured to explain to him the (then half-baked) LSPs problem I was excited about at that time (Section 3), which eventually led to a new algorithm for MPGs [13]. Later Leonid told me he had known it all along (and had even suggested it to Richard Karp), and retaliated with [55, 37, 36], which provide a generalization concerning blocking and give a polynomial bound, but are restricted to nonnegative edge weights only. This, unfortunately, is not enough to cover MPGs. When I suggested generalizing LSPs to other controlled optimization problems, like controlled maximum flow, Leonid was at first enthusiastic, until we quickly realized that such problems were usually NP-hard.
For that reason he did not believe in controlled linear programming (Section 6), which finally and surprisingly turned out not to be NP-hard in several practically important cases. There remain numerous promising ideas we tried (some of which I still have to understand) and which, unfortunately, we were not given enough time to develop.

Preliminaries on cyclic games

Mean payoff games

An MPG is played on a finite directed edge-weighted graph $G = (V, E, w)$, where the set of vertices $V$ is partitioned into two nonempty subsets $V_{MAX}$ and $V_{MIN}$, every vertex has at least one outgoing edge (i.e., there are no sinks or leaves), and $w : E \to \mathbb{R}$ is the edge weight (or cost) function. An MPG is a pair $(G, v_0)$, where $v_0 \in V$ is distinguished as the start vertex. Thus a game graph defines $|V|$ games with different start vertices.

Given an MPG $(G, v_0)$, a play develops in the following way. Initially, a pebble is placed in the start vertex $v_0$, and players MAX and MIN begin constructing an infinite sequence of edges $\{(v_i, v_{i+1})\}_{i=0}^{+\infty}$. If the pebble is in a vertex $v_i \in V_{MAX}$, then MAX selects an outgoing edge from $v_i$ and moves the pebble to its destination vertex $v_{i+1}$; otherwise MIN makes the analogous choice and move. A (general) strategy of a player is a rule for selecting successor vertices in a play, as a function of the whole history of the preceding play. Players MAX and MIN are adversaries: MAX wants to maximize (over all possible strategies), whereas MIN wants to minimize, the values MAX$(G, v_0)$ and MIN$(G, v_0)$ defined, respectively, as [...] (Ehrenfeucht and Mycielski [21]).

The values of infinite and finite MPGs on the same graph starting in the same vertex, as well as optimal positional strategies, coincide. This is somewhat surprising, because intuitively the knowledge of previously visited vertices in a finite play seems important for closing the cycle optimally.
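
As an illustration of how a play unfolds, here is a minimal Python sketch. It assumes that both players use positional strategies (one fixed successor per vertex) and that the payoff is the usual long-run average edge weight, which is an assumption here because the defining formula is elided in the text above. Under these assumptions the play is a path that runs into a cycle and then repeats it forever, so the long-run average equals the mean weight of that cycle. The names `play_value`, `weights`, and `strategy`, and the three-vertex graph, are hypothetical and chosen only for this example; integer weights are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def play_value(weights, strategy, start):
    """Mean weight of the cycle reached from `start` when every vertex v
    moves the pebble to strategy[v] (both players' choices merged)."""
    position = {}            # vertex -> index in `path` at its first visit
    path = []                # vertices in the order the pebble visits them
    v = start
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = strategy[v]
    cycle = path[position[v]:]   # the vertices that are repeated forever
    total = sum(weights[(u, strategy[u])] for u in cycle)
    return Fraction(total, len(cycle))

# Hypothetical example graph with integer edge weights.
weights = {("a", "b"): 3, ("a", "c"): 0,
           ("b", "a"): -1, ("b", "c"): 0,
           ("c", "c"): 2}
strategy = {"a": "b", "b": "a", "c": "c"}   # one successor per vertex

value_from_a = play_value(weights, strategy, "a")   # cycle a -> b -> a
value_from_c = play_value(weights, strategy, "c")   # self-loop at c
```

The finite prefix leading into the cycle does not influence the result, since its contribution to the average vanishes as the play grows longer.
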
Interestingly, the proofs of Theorems 2.1 and 2.2 in [21] are based on a subtle cyclic interplay between infinite and finite MPGs: both versions are used in [21] in order to establish claims about either one of them. This cyclic dependency is eliminated in [9], where everything reduces to finite MPGs.

Straightforward algorithms

In principle, the existence of optimal pure positional strategies (Theorem 2.1) immediately gives several straightforward methods for finding the value of a vertex in an MPG. The first consists of checking all pairs of pure positional strategies of both players and finding a saddle point of the corresponding matrix (in general, exponentially large in the number of vertices). The second is based on the following polynomial-time decidability of one-player MPGs.

Proposition 2.3. If one of the players fixes his pure positional strategy, an optimal counterstrategy of the opponent is polynomial-time computable.
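
The first straightforward method can be sketched in Python as follows. This is a minimal brute-force illustration, not an algorithm from the paper: it enumerates every pair of pure positional strategies, evaluates the resulting play as the mean weight of the cycle it reaches (assuming the usual long-run average payoff), and returns the max over MAX's strategies of the min over MIN's strategies. By the existence of optimal positional strategies this max-min is the saddle-point value. All names (`succ`, `weights`, `brute_force_value`, and the example graph) are hypothetical, and the running time is exponential in the number of vertices.

```python
from fractions import Fraction
from itertools import product

def play_value(weights, strategy, start):
    """Mean weight of the cycle reached from `start` under `strategy`."""
    position, path, v = {}, [], start
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = strategy[v]
    cycle = path[position[v]:]
    return Fraction(sum(weights[(u, strategy[u])] for u in cycle), len(cycle))

def positional_strategies(succ, owned):
    """Yield every choice of one successor for each vertex in `owned`."""
    owned = sorted(owned)
    for picks in product(*(succ[v] for v in owned)):
        yield dict(zip(owned, picks))

def brute_force_value(succ, weights, v_max, v_min, start):
    """Max over MAX's positional strategies of the worst case over MIN's."""
    best = None
    for s_max in positional_strategies(succ, v_max):
        worst = min(play_value(weights, {**s_max, **s_min}, start)
                    for s_min in positional_strategies(succ, v_min))
        if best is None or worst > best:
            best = worst
    return best

# Hypothetical example: MAX owns vertex a, MIN owns vertices b and c.
succ = {"a": ["b", "c"], "b": ["a", "c"], "c": ["c"]}
weights = {("a", "b"): 3, ("a", "c"): 0,
           ("b", "a"): -1, ("b", "c"): 0,
           ("c", "c"): 2}
value_a = brute_force_value(succ, weights, {"a"}, {"b", "c"}, "a")
```

The inner `min` could be replaced by a polynomial-time computation of the opponent's best response, which is what Proposition 2.3 makes possible; the sketch keeps the exhaustive search only for simplicity.
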
doi:10.1016/j.dam.2008.04.012 fatcat:5fmgypimu5cs3ahclgd4lsc4si