Prediction-based search for autonomous game-playing [article]

Alexander Dockhorn, Universitäts- Und Landesbibliothek Sachsen-Anhalt, Martin-Luther Universität, Rudolf Kruse
2020
Simulation-based search algorithms have been widely applied in the context of autonomous game-playing. Their flexibility allows for the rapid development of agents that are able to achieve satisfying performance in many problem domains. However, these algorithms share two requirements, namely, access to a forward model and full knowledge of the environment's state. In this thesis, simulation-based search algorithms will be adapted to tasks in which either the forward model or the state of the environment is unknown. To play a game without a forward model, methods for learning the environment's model from recent interactions between the agent and the environment are proposed. These forward model learning techniques allow the agent to predict the outcome of its actions and therefore enable a prediction-based search process. An analysis of environment models shows how they can be represented and learned in the form of an end-to-end forward model. Based on this general approach, three methods are proposed which reduce the number of possible models and, thus, the training time required. The proposed forward model learning techniques are evaluated according to their applicability to general game-learning tasks and validated on a wide variety of games. The results show the applicability of prediction-based search agents for games where the forward model is not accessible. If the environment's state cannot be fully observed by the agent and the number of possible states is low, state determinisation methods, which uniformly sample possible states, have been shown to perform well. However, if the number of states is high, the uniform state sampling approach performs worse than non-determinising search methods because the search process spends too much time on unlikely states. In this thesis, two methods for predictive state determinisation are proposed. These sample probable states based on the agent's partial observation of the current state and a database of previously played games, which allows the agent to focus [...]
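
As a rough illustration of the prediction-based search idea described in the abstract, the following minimal sketch learns a forward model from observed transitions and then searches over the model's predictions instead of the real game. It is not the thesis's method: the tabular count-based model, the names `TabularForwardModel`, `search_value`, `choose_action` and `true_step`, the exhaustive depth-limited search, and the toy corridor game are all assumptions made for this example.

```python
from collections import Counter, defaultdict


class TabularForwardModel:
    """Hypothetical learned forward model: counts observed transitions."""

    def __init__(self):
        # (state, action) -> Counter of next states seen so far
        self.counts = defaultdict(Counter)

    def observe(self, state, action, next_state):
        """Record one interaction between the agent and the environment."""
        self.counts[(state, action)][next_state] += 1

    def predict(self, state, action):
        """Return the most frequently observed next state.

        Assumption for this sketch: an unseen (state, action) pair is
        predicted to leave the state unchanged.
        """
        seen = self.counts.get((state, action))
        if not seen:
            return state
        return seen.most_common(1)[0][0]


def search_value(model, state, actions, score, depth):
    """Best score reachable within `depth` predicted steps from `state`."""
    if depth == 0:
        return score(state)
    return max(
        search_value(model, model.predict(state, a), actions, score, depth - 1)
        for a in actions
    )


def choose_action(model, state, actions, score, depth=3):
    """Pick the action whose predicted successor has the best search value."""
    return max(
        actions,
        key=lambda a: search_value(
            model, model.predict(state, a), actions, score, depth - 1
        ),
    )


def true_step(pos, action):
    """Toy environment (assumption): a corridor with positions 0..4."""
    return max(0, min(4, pos + action))


# The agent only sees transitions, never `true_step` itself during search.
model = TabularForwardModel()
for pos in range(5):
    for action in (-1, 1):
        model.observe(pos, action, true_step(pos, action))

# Score a state by its position, so the agent should move right.
best = choose_action(model, 2, [-1, 1], score=lambda s: s, depth=3)
print(best)
```

The search only ever calls `model.predict`, so the same agent code would work for any game from which transitions can be recorded; the quality of its decisions then depends on how well the learned model matches the real environment.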
doi:10.25673/34014 fatcat:irqtgwindvazbka3uf2ryjgjle