Game-Tree Search Using Proof Numbers: The First Twenty Years

Akihiro Kishimoto, Mark H.M. Winands, Martin Müller, Jahn-Takeshi Saito
2012 ICGA Journal  
Solving games is a challenging and attractive task in the domain of Artificial Intelligence. Despite enormous progress, solving increasingly difficult games or game positions continues to pose hard technical challenges. Over the last twenty years, algorithms based on the concept of proof and disproof numbers have become dominant techniques for game solving. Prominent examples include solving the game of checkers to be a draw, and developing checkmate solvers for shogi, which can find mates that take over a thousand moves. This article provides an overview of the research on Proof-Number Search and its many variants and enhancements.
... P to solve, a solver must select a move that is provably best, while a tournament program only has to select a move that is probably best. In principle, given an unlimited amount of time, αβ can determine whether a position P of any finite game is a win or not, even on moderate hardware. Assign a score of ∞ to a terminal position that is a first-player win, −∞ to a terminal position that is not a win, and possibly other, heuristic values, which approximate the winning probability of the first player, to the undecided positions. Then exploring a game tree rooted at P deeply enough with αβ will determine the score of P as either ∞ or −∞.
In practice, because αβ is essentially a fixed-depth search, it inherently suffers from a search tree that grows exponentially with the search depth. This exponential growth limits the performance of αβ search, even with all the effective enhancements such as iterative deepening (Slate and Atkin ... can greatly improve the effective branching factor and the search depth reached, but do not change the basic fact of exponential growth.
An implicit assumption in solvers based on αβ search is that proofs at shallow search depths are easier to find. However, many popular games such as Go-Moku or checkmating puzzles are characterized by solutions containing narrow but very deep lines of play. It is difficult to adjust αβ search to perform well in these cases. Proof-Number Search (PNS) (Allis, van der Meulen, and van den Herik, 1994) performs a variable-depth search that has no explicit bounds on the search depth. The notion of proof and disproof numbers in PNS originates from McAllester's conspiracy numbers, which measure the reliability of the score in the minimax framework (McAllester, 1985; McAllester, 1988).
Like PNS, McAllester's conspiracy number search (CNS) performs a variable-depth search: it tries to expand a leaf node whose score is unreliable, as estimated by conspiracy numbers, in order to make that score more reliable. Unlike CNS, PNS specializes conspiracy numbers to AND/OR tree search with binary (win/loss) outcomes. This specialization significantly reduces the memory requirement of PNS compared to CNS. Additionally, in contrast to conspiracy numbers, proof and disproof numbers also estimate the difficulty of solving a node. As a result, PNS implements a "simplest-first" search paradigm, which can find small but potentially deep proofs efficiently.
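The "simplest-first" idea can be illustrated with a minimal Python sketch of basic PNS on an explicit AND/OR tree. It is not the authors' implementation. The game is a hypothetical subtraction game chosen only for illustration (players alternately remove 1 to `max_take` stones, and whoever takes the last stone wins), and the names `Node`, `select_most_proving`, `expand`, `update`, and `pns` are assumptions of this sketch. It uses the usual initialization for basic PNS: an unsolved leaf gets proof and disproof number 1, a proven node gets (0, ∞), and a disproven node gets (∞, 0).

```python
import math

INF = math.inf


class Node:
    """A node of the AND/OR tree for the illustrative subtraction game."""

    def __init__(self, stones, is_or, parent=None):
        self.stones = stones    # stones left in the pile
        self.is_or = is_or      # True: the first player (the prover) is to move
        self.parent = parent
        self.children = []
        self.expanded = False
        self.pn, self.dn = 1, 1  # unsolved leaf: one node left to prove or disprove
        if stones == 0:
            # The side to move has nothing to take: the previous mover won.
            if is_or:
                self.pn, self.dn = INF, 0   # disproven: first player lost
            else:
                self.pn, self.dn = 0, INF   # proven: first player won


def select_most_proving(node):
    """Descend to the leaf whose expansion is estimated to be simplest."""
    while node.expanded:
        if node.is_or:
            node = min(node.children, key=lambda c: c.pn)
        else:
            node = min(node.children, key=lambda c: c.dn)
    return node


def expand(node, max_take):
    """Generate all children of an unsolved leaf."""
    for take in range(1, min(max_take, node.stones) + 1):
        node.children.append(Node(node.stones - take, not node.is_or, node))
    node.expanded = True


def update(node):
    """Recompute proof and disproof numbers from the children."""
    pns_ = [c.pn for c in node.children]
    dns_ = [c.dn for c in node.children]
    if node.is_or:
        # One winning child suffices to prove; all children must fail to disprove.
        node.pn, node.dn = min(pns_), sum(dns_)
    else:
        # All children must be won to prove; one failing child suffices to disprove.
        node.pn, node.dn = sum(pns_), min(dns_)


def pns(stones, max_take=2):
    """Return (first_player_wins, number_of_expansions) for the root position."""
    root = Node(stones, is_or=True)
    expansions = 0
    while root.pn != 0 and root.dn != 0:
        leaf = select_most_proving(root)
        expand(leaf, max_take)
        expansions += 1
        node = leaf
        while node is not None:   # propagate new values up to the root
            update(node)
            node = node.parent
    return root.pn == 0, expansions


if __name__ == "__main__":
    for n in range(1, 8):
        won, expansions = pns(n)
        print(n, "win" if won else "loss", expansions)
```

With `max_take=2`, the first player should win exactly when the number of stones is not a multiple of 3, which gives a quick sanity check of the sketch. The sketch keeps the whole tree in memory and treats the search space as a tree rather than a graph, so it does not handle transpositions or repetitions.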
doi:10.3233/icg-2012-35302 fatcat:whg346ql2vdbzdip3unujsgfya