
Temporal Difference Learning of an Othello Evaluation Function for a Small Neural Network with Shared Weights

Edward P. Manning
2007 2007 IEEE Symposium on Computational Intelligence and Games  
This paper presents an artificial neural network with shared weights, trained to play the game of Othello by self-play with Temporal Difference Learning (TDL).  ...  The network performs as well as the champion of the CEC 2006 Othello Evaluation Function Competition. The TDL-trained network contains only 67 unique weights, compared to 2113 for the champion.  ... 
doi:10.1109/cig.2007.368101 dblp:conf/cig/Manning07 fatcat:dbrz4eommbcdregxpx3xo2glrq
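Several entries in this listing train evaluation functions by temporal difference learning through self-play. As a point of reference, the core TD(0) update they share can be sketched as follows (a minimal tabular sketch; the dictionary-based value table and the state names are illustrative, not taken from any of the cited papers — the papers apply the same rule to neural-network weights via gradient descent):

```python
def td0_update(values, state, next_state, alpha=0.1, gamma=1.0):
    """One TD(0) update of a tabular value function.

    `values` maps states to estimated values. With a neural network,
    the same TD error would instead be back-propagated through the
    network weights.
    """
    td_error = gamma * values.get(next_state, 0.0) - values.get(state, 0.0)
    values[state] = values.get(state, 0.0) + alpha * td_error
    return values[state]

# toy example: the value of state "a" moves toward that of its successor "b"
v = {"a": 0.0, "b": 1.0}
td0_update(v, "a", "b", alpha=0.5)
```

After the update, `v["a"]` has moved halfway toward `v["b"]`, since the learning rate is 0.5.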

A Study of Artificial Neural Network Architectures for Othello Evaluation Functions

Kevin J. Binkley, Ken Seehart, Masafumi Hagiwara
2007 Transactions of the Japanese society for artificial intelligence  
In this study, we use temporal difference learning (TDL) to investigate the ability of 20 different artificial neural network (ANN) architectures to learn Othello game board evaluation functions.  ...  keywords: artificial neural network, temporal difference learning, reinforcement learning, board games, othello  ...  The evaluation function was an artificial neural network (ANN) using a straightforward board encoding and trained through temporal difference learning (TDL) [Sutton 88, Sutton 98].  ... 
doi:10.1527/tjsai.22.461 fatcat:t4eospe4fvcdxbf3ttzqhb52bm

Coevolutionary Temporal Difference Learning for Othello

Marcin Szubert, Wojciech Jaśkowski, Krzysztof Krawiec
2009 2009 IEEE Symposium on Computational Intelligence and Games  
This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive coevolution with temporal difference learning.  ... 
doi:10.1109/cig.2009.5286486 dblp:conf/cig/SzubertJK09 fatcat:2byzeqgxzbb33ju7fhbekwlli4

Ensemble approaches in evolutionary game strategies: A case study in Othello

Kyung-Joong Kim, Sung-Bae Cho
2008 2008 IEEE Symposium On Computational Intelligence and Games  
The ensemble approach is tested on the Othello game with a weighted piece counter representation.  ...  Additionally, the computational cost of an exhaustive search for the selective ensemble is reduced by introducing multi-stage evaluations.  ...  Chong et al. use evolutionary algorithms to learn spatial neural networks as evaluation functions for Othello board configurations [17].  ... 
doi:10.1109/cig.2008.5035642 dblp:conf/cig/KimC08 fatcat:sabx4zunybbonhulzekav5hzhy
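The weighted piece counter (WPC) representation used in this entry (and as a baseline in several others) scores a position as the dot product of a fixed per-square weight vector with the board contents. A minimal sketch, assuming the board is flattened to +1 (own disc), -1 (opponent disc), 0 (empty); the toy 2x2 board and weights below are invented for illustration:

```python
def wpc_evaluate(board, weights):
    """Weighted piece counter: score a position as the dot product of
    the board vector (+1 own, -1 opponent, 0 empty) with a fixed
    per-square weight vector."""
    assert len(board) == len(weights)
    return sum(b * w for b, w in zip(board, weights))

# toy 2x2 "board"; in real Othello WPC tables, corners get the largest weights
board = [1, -1, 0, 1]
weights = [1.0, 0.25, 0.25, 1.0]
score = wpc_evaluate(board, weights)  # 1.0 - 0.25 + 0.0 + 1.0
```

Despite its simplicity, this linear evaluator is a common representation for evolved and TDL-trained Othello players because the weight vector is small and fast to evaluate.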

Learning to Play Othello with Deep Neural Networks

Paweł Liskowski, Wojciech M. Jaśkowski, Krzysztof Krawiec
2018 IEEE Transactions on Games  
In this paper, we verify whether CNN-based move predictors prove effective for Othello, a game with significantly different characteristics, including a much smaller board size and complete lack of translational  ...  The empirical evaluation confirms high capabilities of neural move predictors and suggests a strong correlation between prediction accuracy and playing strength.  ...  The suite consists of players with board evaluation functions encoded by weighted piece counter and n-tuple networks, trained by different methods including hand-design, temporal difference learning, evolution  ... 
doi:10.1109/tg.2018.2799997 fatcat:vj77jdqn7nhvte2wp574yatvwq

Predicting expert moves in the game of Othello using fully convolutional neural networks

Hlynur Davíð Hlynsson
2017 Figshare  
The main result is that using a raw board state representation, an 11-layer convolutional neural network can be trained to achieve 57.4% prediction accuracy on a test set, surpassing previous state of  ...  Careful feature engineering is an important factor in artificial intelligence for games.  ...  Artificial neural networks for Othello: Binkley et al. investigated different artificial neural network (ANN) architectures for Othello using temporal difference learning.  ... 
doi:10.6084/m9.figshare.5326573 fatcat:eo7zqx2oebcdpak5llymtfyvhy

Analysis of Hyper-Parameters for Small Games: Iterations or Epochs in Self-Play? [article]

Hui Wang, Michael Emmerich, Mike Preuss, Aske Plaat
2020 arXiv   pre-print
In self-play, Monte Carlo Tree Search is used to train a deep neural network that is then used in tree searches.  ...  A secondary result of our experiments concerns the choice of optimization goals, for which we also provide recommendations.  ...  Chong et al. described the evolution of neural networks for learning to play Othello [25].  ... 
arXiv:2003.05988v1 fatcat:y7mtudj3q5anbesfnviwtxd3pq

A Coevolutionary Model for The Virus Game

P.I. Cowling, M.H. Naveed, M.A. Hossain
2006 2006 IEEE Symposium on Computational Intelligence and Games  
In this paper, coevolution is used to evolve Artificial Neural Networks (ANN) which evaluate board positions of a two-player zero-sum game (The Virus Game).  ...  The coevolved neural networks play at a level that beats a group of strong hand-crafted AI players.  ...  The weights of a deterministic evaluation function are evolved using a co-adapted GA with explicit fitness sharing.  ... 
doi:10.1109/cig.2006.311680 dblp:conf/cig/CowlingNH06 fatcat:ww4ulouaufdu7ljclyahnrpewe

Automatic Generation of Evaluation Features for Computer Game Players

Makoto Miwa, Daisaku Yokoyama, Takashi Chikayama
2007 2007 IEEE Symposium on Computational Intelligence and Games  
Evaluation functions are usually constructed manually as a weighted linear combination of evaluation features that characterize game positions.  ...  Accuracy of evaluation functions is one of the critical factors in computer game players.  ...  Once game features are generated, they can be automatically weighted to form an evaluation function through a variety of successful methods, including those based on neural networks, temporal difference  ... 
doi:10.1109/cig.2007.368108 dblp:conf/cig/MiwaYC07 fatcat:u5caqv4mdrejxktrr4z7q7ehwq

Mastering 2048 with Delayed Temporal Coherence Learning, Multi-Stage Weight Promotion, Redundant Encoding and Carousel Shaping [article]

Wojciech Jaśkowski
2016 arXiv   pre-print
With the aim to develop a strong 2048 playing program, we employ temporal difference learning with systematic n-tuple networks.  ...  We show that this basic method can be significantly improved with temporal coherence learning, a multi-stage function approximator with weight promotion, carousel shaping, and redundant encoding.  ...  Acknowledgments The author thanks Marcin Szubert for his comments on the manuscript and Adam Szczepański for implementing an efficient C++ version of the 2048 agent.  ... 
arXiv:1604.05085v3 fatcat:ykrkoiibondldmefeivq52itri
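The n-tuple networks used here (and in several of the Othello entries above) evaluate a position by summing lookup-table entries, one per fixed tuple of board squares: the contents of each tuple's squares form an index into that tuple's table. A minimal sketch, assuming squares take one of three states (the tuple placements and table values below are invented for illustration; trained networks learn the table entries by TDL):

```python
def ntuple_value(board, tuples, luts, n_values=3):
    """n-tuple network evaluation: each tuple of square indices selects
    one entry from its lookup table by treating the squares' contents
    as a base-`n_values` number; the selected entries are summed.
    For Othello, n_values=3 covers empty / own / opponent squares."""
    total = 0.0
    for t, lut in zip(tuples, luts):
        index = 0
        for sq in t:
            index = index * n_values + board[sq]
        total += lut[index]
    return total

board = [0, 1, 2, 1]            # 0 = empty, 1 = own, 2 = opponent
tuples = [(0, 1), (2, 3)]       # two 2-tuples of square indices
luts = [[0.0] * 9, [0.0] * 9]   # one table of size 3**2 per tuple
luts[0][1] = 0.5                # learned weight for pattern (empty, own)
luts[1][7] = -0.2               # learned weight for pattern (opponent, own)
value = ntuple_value(board, tuples, luts)
```

Because evaluation is just a handful of table lookups, n-tuple networks are much faster than multilayer networks at comparable playing strength, which is why they recur across these papers.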

AlphaZero-Inspired General Board Game Learning and Playing [article]

Johannes Scheiermann, Wolfgang Konen
2022 arXiv   pre-print
In this paper, we pick an important element of AlphaZero, the Monte Carlo Tree Search (MCTS) planning stage, and combine it with reinforcement learning (RL) agents.  ...  Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning.  ...  Other function approximation networks (deep neural networks or other) could be used as well in AlphaZero-inspired reinforcement learning, but n-tuple networks have the advantage that they can be trained  ... 
arXiv:2204.13307v1 fatcat:v2u4b3pylnhknn3ros4nullxsm

Coevolutionary Temporal Difference Learning for small-board Go

Krzysztof Krawiec, Marcin Szubert
2010 IEEE Congress on Evolutionary Computation  
In this paper we apply Coevolutionary Temporal Difference Learning (CTDL), a hybrid of coevolutionary search and reinforcement learning proposed in our former study, to evolve strategies for playing the  ...  CTDL works by interlacing exploration of the search space provided by one-population competitive coevolution and exploitation by means of temporal difference learning.  ...  ACKNOWLEDGMENTS This work was supported in part by Ministry of Science and Higher Education grant # N N519 3505 33.  ... 
doi:10.1109/cec.2010.5586054 dblp:conf/cec/KrawiecS10 fatcat:qi65gddgungrxbtbxu2nte3bj4

Evolving small-board Go players using coevolutionary temporal difference learning with archives

Krzysztof Krawiec, Wojciech Jaśkowski, Marcin Szubert
2011 International Journal of Applied Mathematics and Computer Science  
We apply Coevolutionary Temporal Difference Learning (CTDL) to learn small-board Go strategies represented  ...  Intra-game learning is driven by gradient-descent Temporal Difference Learning (TDL), a reinforcement learning method that updates the board evaluation function according to differences observed between  ...  Acknowledgment This work has been supported by the Polish Ministry of Science and Higher Education under the grant no. N N519 441939.  ... 
doi:10.2478/v10006-011-0057-3 fatcat:uv6tkgqbbfaulblu3jv5da2yp4

Evolutionary computation and games

S.M. Lucas, G. Kendall
2006 IEEE Computational Intelligence Magazine  
Blondie24 utilizes an artificial neural network with 5,046 weights, which are evolved by an evolutionary strategy.  ...  Natural evolution can be considered to be a game in which the rewards for an organism that plays a good game of life are the propagation of its genetic material to its successors and its continued survival  ...  Acknowledgments We thank Thomas Runarsson for discussions related to this article.  ... 
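Blondie24-style players evolve their network weights instead of training them by gradient descent. The idea can be sketched as a minimal (1+1) evolution strategy (the toy fitness function below is invented for illustration; Blondie24 itself used a population-based strategy whose fitness came from games of checkers):

```python
import random

def evolve_weights(fitness, weights, sigma=0.1, generations=200, rng=None):
    """Minimal (1+1) evolution strategy: perturb every weight with
    Gaussian noise and keep the child only if its fitness does not
    decrease. Real game players score fitness by playing matches."""
    rng = rng or random.Random(0)
    best = list(weights)
    best_fit = fitness(best)
    for _ in range(generations):
        child = [w + rng.gauss(0.0, sigma) for w in best]
        f = fitness(child)
        if f >= best_fit:
            best, best_fit = child, f
    return best, best_fit

# toy fitness: negative squared distance from a target weight vector
target = [0.5, -0.3]
fit = lambda w: -sum((a - b) ** 2 for a, b in zip(w, target))
w, f = evolve_weights(fit, [0.0, 0.0])
```

Because selection needs only a fitness comparison, the same loop works whether fitness comes from a fixed objective, as here, or from win rates against coevolving opponents.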
doi:10.1109/mci.2006.1597057 fatcat:6o7yidxnirftvarbpx2fcxqh44

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe (+8 others)
2016 Nature  
These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play.  ...  We also introduce a new search algorithm that combines Monte-Carlo simulation with value and policy networks.  ...  Acknowledgements We thank Fan Hui for agreeing to play against AlphaGo; Toby Manning for refereeing the match; Ostrovski for reviewing the paper; and the rest of the DeepMind team for their support, ideas  ... 
doi:10.1038/nature16961 pmid:26819042 fatcat:hhxixsirtjairjuwqivmi3gcga