
Playtesting in Match 3 Game Using Strategic Plays via Reinforcement Learning

Yuchu Shin, Jaewon Kim, Kyohoon Jin, Youngbin Kim
2020 IEEE Access  
Five strategic plays were defined in the Match 3 game under consideration and game playtesting was performed for each situation via reinforcement learning.  ...  INDEX TERMS Actor-critic, agent, artificial intelligence, game mission, game strategy, match 3, playtesting, reinforcement learning.  ...  This study defines five strategic plays that are commonly used in missions, as presented in Fig. 1 and the reinforcement learning is performed based on the listed plays.  ... 
doi:10.1109/access.2020.2980380 fatcat:oszzdmtkkngv3gm7sxorgpmjmu

Learning to make better strategic decisions

Eric Cardella
2012 Journal of Economic Behavior and Organization  
This paper experimentally investigates how the decision making quality of an agent's opponent impacts learning-by-doing (LBD) and learning-by-observing (LBO) in a 2-player strategic game.  ...  I consider an experimental design that enables me to measure strategic decision making quality, and control the decision making quality of an agent's opponent.  ...  To shed light on these questions, I propose a stylized experimental design, described in detail in the following section, that uses a 2-player, sequential-move game which features a dominant strategy.  ... 
doi:10.1016/j.jebo.2012.04.011 fatcat:r7zkf2kjdzhwjhttc27jjrek4y

AI in Games: Techniques, Challenges and Opportunities [article]

Qiyue Yin, Jun Yang, Wancheng Ni, Bin Liang, Kaiqi Huang
2021 arXiv   pre-print
In this paper, we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooting game AIs and real time strategy game AIs.  ...  professional level AIs; 3) raise the challenges or drawbacks in the current AIs for intelligent decision making; and 4) try to propose future trends in the games and intelligent decision making techniques  ...  However, computation of fictitious self-play for complex game is high, so researchers develop various self-play strategies, and uses distributed reinforcement learning to learn each generation of agents  ... 
arXiv:2111.07631v1 fatcat:g4sbl6v73rg4jdijj4qfi3eusq

Bush‐Mosteller learning for a zero-sum repeated game with random pay-offs

Alexander S. Poznyak, Kaddour Najim
2001 International Journal of Systems Science  
The suggested study is based on the learning automata paradigm and a limiting average reward criterion is tackled to analyse the arising Nash equilibrium.  ...  The analysis of the convergence (adaptation) as well as the convergence rate (rate of adaptation) are presented and the optimal design parameters of this adaptive procedure are derived.  ...  Acknowledgements The authors are grateful to the anonymous reviewers for their helpful comments and advice.  ... 
doi:10.1080/00207720110042347 fatcat:t7tti6bofnbfzj52iv6vfr54lq

Hierarchical Reinforcement Learning for Multi-agent MOBA Game [article]

Zhijian Zhang, Haozheng Li, Luo Zhang, Tianyin Zheng, Ting Zhang, Xiong Hao, Xiaoxin Chen, Min Chen, Fangxu Xiao, Wei Zhou
2019 arXiv   pre-print
The novelty of this work are: (1) proposing a hierarchical framework, where agents execute macro strategies by imitation learning and carry out micromanipulations through reinforcement learning, (2) developing  ...  a simple self-learning method to get better sample efficiency for training, and (3) designing a dense reward function for multi-agent cooperation in the absence of game engine or Application Programming  ...  Reward Design and Self-learning Reward Design Reward function plays a significant role in reinforcement learning, and good learning results of an agent are mainly depend on diverse rewards.  ... 
arXiv:1901.08004v6 fatcat:fiotznnqlfctldreiwepjay6iu

Learning through reinforcement for N-person repeated constrained games

A.S. Poznyak, K. Najim
2002 IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics)  
The suggested adaptive strategy which uses only the current realizations (outcomes and constraints) of the game, is based on the Bush-Mosteller reinforcement scheme in connection with a normalization procedure  ...  Simulation results illustrate the feasibility and the performance of this adaptive strategy. Index Terms: Adaptive strategies, learning automata (LA), reinforcement learning, repeated game.  ...  Gomez Ramirez for the discussion and his help in the simulation process, as well as the anonymous reviewers for their helpful comments and advice.  ... 
doi:10.1109/tsmcb.2002.1049610 pmid:18244882 fatcat:o5htr3z4q5bnndgohy2ub3tqgu

Superstition in the Network: Deep Reinforcement Learning Plays Deceptive Games [article]

Philip Bontrager, Ahmed Khalifa, Damien Anderson, Matthew Stephenson, Christoph Salge, Julian Togelius
2019 arXiv   pre-print
Deep reinforcement learning has learned to play many games well, but failed on others.  ...  To better characterize the modes and reasons of failure of deep reinforcement learners, we test the widely used Asynchronous Actor-Critic (A2C) algorithm on four deceptive games, which are specially designed  ...  Reinforcement Learning To test if these games are capable of deceiving an agent trained via reinforcement learning, we use Advantage Actor-Critic (A2C) to learn to play the games (Mnih et al. 2016).  ... 
arXiv:1908.04436v1 fatcat:elbspyxhrjeffkid3wz7tuhqvy

Machine Discovery of Comprehensible Strategies for Simple Games Using Meta-interpretive Learning

Stephen H. Muggleton, Celine Hocquette
2019 New generation computing  
We use these games to compare Cumulative Minimax Regret for variants of both standard and deep reinforcement learning against two variants of a new Meta-interpretive Learning system called MIGO.  ...  In this paper, we consider Machine Discovery of human-comprehensible strategies for simple two-person games (Noughts-and-Crosses and Hexapawn).  ...  distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were  ... 
doi:10.1007/s00354-019-00054-2 fatcat:5ne62g4hljc47nlorlmf4h6zgy

Hybrid Online and Offline Reinforcement Learning for Tibetan Jiu Chess

Xiali Li, Zhengyu Lv, Licheng Wu, Yue Zhao, Xiaona Xu
2020 Complexity  
Q-learning is also used to update all the nodes on the search path when each game ends.  ...  An improved deep neural network based on ResNet18 is used for self-play training.  ...  Acknowledgments This study was funded by the National Natural Science Foundation of China (61873291 and 61773416) and the MUC 111 Project.  ... 
doi:10.1155/2020/4708075 fatcat:nd4ncim3ybfsnantm4zfaslr3u

Multi-Robot Cooperation Strategy in Game Environment Using Deep Reinforcement Learning

Hongda Zhang, Decai Li, Yuqing He
2018 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO)  
On this basis, we conduct a Nash equilibrium game strategy analysis on the specific multi-agent game problem--the territory defense, use deep Q learning method to learn the defender's joint defense strategy  ...  To this end, based on the deep reinforcement learning method, we analyze the multi-agent collaboration strategy in the game environment and propose a learning method that can measure cooperative information  ...  We conduct a Nash equilibrium game strategy analysis on the specific multi-agent game problem--the territory defense, and use deep Q learning to learn the defender's joint defense strategy.  ... 
doi:10.1109/robio.2018.8665165 dblp:conf/robio/ZhangLH18 fatcat:z6dsgnkvsbgvtcfcmrqqyci72i

Can Meta-Interpretive Learning outperform Deep Reinforcement Learning of Evaluable Game strategies? [article]

Céline Hocquette, Stephen H. Muggleton
2019 arXiv   pre-print
We use these games to compare Cumulative Minimax Regret for variants of both standard and deep reinforcement learning against two variants of a new Meta-Interpretive Learning system called MIGO.  ...  In our experiments all tested variants of both normal and deep reinforcement learning have worse performance (higher cumulative minimax regret) than both variants of MIGO on Noughts-and-Crosses and Hexapawn  ...  However, most systems aim at learning single agent policy and, in contrast to MIGO, are not designed to learn to play two person games.  ... 
arXiv:1902.09835v1 fatcat:xe24mjababbkbibpsrssjhmnte

How can ignorant but patient cognitive terminals learn their strategy and utility?

S.M. Perlaza, H. Tembine, S. Lasaulce
2010 2010 IEEE 11th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)  
This algorithm possesses attractive convergence properties not available for standard reinforcement learning algorithms and in addition, it allows each transmitter to learn both its optimal strategy and  ...  For this purpose, the framework of learning theory in games is exploited. Here, a new learning algorithm based on mild information assumptions at the transmitters is presented.  ...  u i,j ) and the BG learning used in games to learn strategies only (which can be recovered by choosing λ i,j → 1 and ignoring the second equation).  ... 
doi:10.1109/spawc.2010.5670983 fatcat:oqtitqt3avhxrdtaies2qjt77e

MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl [article]

Nicola Pezzotti
2021 arXiv   pre-print
The agent, MimicBot, is implemented using a specifically designed deep policy network and trained using a combination of imitation and reinforcement learning.  ...  Currently no machine learning approach can beat a scripted bot which makes use of the domain knowledge on the game.  ...  Acknowledgments Thanks to Niels Justesen and Mattias Bermell Rudfeldt for the discussions and Niels Justesen and the other FFAI creators for the support in the creation of MimicBot.  ... 
arXiv:2108.09478v1 fatcat:zvypsdassbhlla6frf73pcnjqm

Coevolutionary Temporal Difference Learning for Othello

Marcin Szubert, Wojciech Jaskowski, Krzysztof Krawiec
2009 2009 IEEE Symposium on Computational Intelligence and Games  
We apply CTDL to the board game of Othello, using weighted piece counter for representing players' strategies.  ...  This paper presents Coevolutionary Temporal Difference Learning (CTDL), a novel way of hybridizing coevolutionary search with reinforcement learning that works by interlacing one-population competitive  ...  ACKNOWLEDGMENTS This work was supported in part by Ministry of Science and Higher Education grant # N N519 3505 33 and grant POIG.01.01.02-00-014/08-00.  ... 
doi:10.1109/cig.2009.5286486 dblp:conf/cig/SzubertJK09 fatcat:2byzeqgxzbb33ju7fhbekwlli4

Integrating learning with motor schema-based control for a Robot Soccer Team [chapter]

Tucker Balch
1998 Lecture Notes in Computer Science  
In overview: each agent is provided a common set of skills (motor schema-based behavioral assemblages) from which it builds a task-achieving strategy using reinforcement learning.  ...  This paper describes a reinforcement learning-based strategy developed for Robocup simulator league competition.  ...  A series of 100 10-point games are played with information on policy convergence and score recorded after each game. The robots retain their learning set between games.  ... 
doi:10.1007/3-540-64473-3_86 fatcat:4houy65jird6fovs2kja6i44p4