Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles [chapter]

Thomas Dueholm Hansen, Uri Zwick
2010 Lecture Notes in Computer Science  
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to weighted directed graphs, which may be viewed as Deterministic MDPs (DMDPs), Howard's algorithm can be used to find Minimum Mean-Cost Cycles (MMCCs). Experimental studies suggest that Howard's algorithm works extremely well in this context. The theoretical complexity of Howard's algorithm for finding MMCCs is a mystery. No polynomial time bound is known on its running time. Prior to this work, there were only linear lower bounds on the number of iterations performed by Howard's algorithm. We provide the first weighted graphs on which Howard's algorithm performs Ω(n²) iterations, where n is the number of vertices in the graph.
The MMCC problem is an interesting problem that has various applications. It generalizes the problem of finding a negative cost cycle in a graph. It is also used as a subroutine in algorithms for solving other problems, such as min-cost flow algorithms (see, e.g., Goldberg and Tarjan [9]). There are several polynomial time algorithms for solving the MMCC problem. Karp [12] gave an O(mn)-time algorithm for the problem, where m is the number of edges and n is the number of vertices in the input graph. Young et al. [18] gave an algorithm whose complexity is O(mn + n² log n). Although this is slightly worse, in some cases, than the running time of Karp's algorithm, the algorithm of Young et al. [18] behaves much better in practice. Dasdan [3] experimented with many different algorithms for the MMCC problem, including Howard's algorithm. He reports that Howard's algorithm usually runs much faster than Karp's algorithm, and is usually almost as fast as the algorithm of Young et al. [18]. A more thorough experimental study of MMCC algorithms was recently conducted by Georgiadis et al. [8].
Understanding the complexity of Howard's algorithm for MMCCs is interesting from both the applied and theoretical points of view. Howard's algorithm for MMCC is an extremely simple and natural combinatorial algorithm, similar in flavor to the Bellman-Ford algorithm for finding shortest paths [1, 2, 6] and to Karp's algorithm [12]. Yet its analysis remains elusive. Howard's algorithm also has the advantage that it can be applied to the more general problem of finding a cycle with a minimum cost-to-time ratio (see, e.g., Megiddo [14, 15]).
Howard's algorithm works in iterations. Each iteration takes O(m) time. It is trivial to construct instances on which Howard's algorithm performs n iterations. (Recall that n and m are the number of vertices and edges in the input graph.) Madani [13] constructed instances on which the algorithm performs 2n − O(1) iterations. No graphs were known, however, on which Howard's algorithm performed more than a linear number of iterations. We construct the first graphs on which Howard's algorithm performs Ω(n²) iterations, showing, in particular, that there are instances on which its running time is Ω(n⁴), an order of magnitude slower than the running times of the algorithms of Karp [12] and Young et al. [18]. We also construct n-vertex outdegree-2 graphs on which Howard's algorithm performs 2n − O(1) iterations. (Madani's [13] examples used Θ(n²) edges.) This example is interesting as it shows that the number of iterations performed may differ from the number of edges in the graph by only an additive constant. It also sheds some more light on the non-trivial, and perhaps non-intuitive, behavior of Howard's algorithm. Our examples still leave open the possibility that the number of iterations performed by Howard's algorithm is always at most m, the number of edges. (The graphs on which the algorithm performs Ω(n²) iterations also have Ω(n²) edges.) We conjecture that this is always the case.
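The text above does not spell out what a single iteration does, so the following is only a minimal sketch of one common formulation of Howard's algorithm on a weighted directed graph, not necessarily the exact variant analyzed in the chapter. It assumes integer edge costs and at least one outgoing edge per vertex. The names `evaluate` and `howard_mmcc`, the choice of the lowest-numbered cycle vertex as the reference point for potentials, and the lexicographic (value, potential) comparison are assumptions made for illustration. Each pass evaluates the current policy (one chosen edge per vertex) and then scans every edge once, which matches the O(m) cost per iteration stated above.

```python
from fractions import Fraction

def evaluate(n, succ, cost):
    """Evaluate a policy: succ[u]/cost[u] describe the single edge chosen at u.
    val[u] is the mean cost of the cycle reached from u; pot[u] is the reduced
    cost (cost minus val) of the path from u to the lowest-numbered vertex of
    that cycle (a convention assumed here)."""
    val, pot = [None] * n, [None] * n
    for s in range(n):
        path, pos = [], {}
        u = s
        while val[u] is None and u not in pos:
            pos[u] = len(path)
            path.append(u)
            u = succ[u]
        if val[u] is None:  # the walk closed a cycle not evaluated before
            cycle = path[pos[u]:]
            path = path[:pos[u]]
            mean = Fraction(sum(cost[x] for x in cycle), len(cycle))
            i = cycle.index(min(cycle))
            cycle = cycle[i:] + cycle[:i]  # rotate so the reference vertex is first
            val[cycle[0]], pot[cycle[0]] = mean, Fraction(0)
            for x in reversed(cycle[1:]):
                val[x] = mean
                pot[x] = cost[x] - mean + pot[succ[x]]
        for x in reversed(path):  # vertices on the path leading into the cycle
            val[x] = val[succ[x]]
            pot[x] = cost[x] - val[x] + pot[succ[x]]
    return val, pot

def howard_mmcc(n, edges):
    """Return (minimum mean cycle cost, number of iterations performed).
    edges is a list of (u, v, cost) triples with integer costs."""
    out = [[] for _ in range(n)]
    for u, v, c in edges:
        out[u].append((v, c))
    # Arbitrary initial policy: the first listed edge out of each vertex.
    succ = [out[u][0][0] for u in range(n)]
    cost = [out[u][0][1] for u in range(n)]
    iterations = 0
    while True:
        iterations += 1
        val, pot = evaluate(n, succ, cost)
        changed = False
        for u in range(n):
            best = (val[u], pot[u])
            for v, c in out[u]:
                appraisal = (val[v], c - val[v] + pot[v])
                if appraisal < best:  # strict lexicographic improvement
                    best = appraisal
                    succ[u], cost[u] = v, c
                    changed = True
        if not changed:
            return min(val), iterations

if __name__ == "__main__":
    example = [(0, 1, 1), (0, 2, 10), (1, 0, 1), (1, 2, 5), (2, 1, -3), (2, 2, 0)]
    print(howard_mmcc(3, example))
```

Exact rational arithmetic (`Fraction`) is used so that the strict-improvement test is not affected by floating-point rounding. The returned iteration count depends on the initial policy, which is chosen arbitrarily here.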
doi:10.1007/978-3-642-17517-6_37 fatcat:nv5ce77vpndkzar2ov2ywmpyje