14 Hits in 2.4 sec

Accelerating Training in Pommerman with Imitation and Reinforcement Learning [article]

Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar
2019 arXiv   pre-print
The proposed methodology is able to beat heuristic and pure reinforcement learning baselines with a combined 100,000 training games, significantly faster than other non-tree-search methods in the literature  ...  Our methodology involves training an agent initially through imitation learning on a noisy expert policy, followed by a proximal-policy optimization (PPO) reinforcement learning algorithm.  ...  Conclusion: We posit that the use of imitation followed by reinforcement learning is an effective way to reduce the training effort in Pommerman.  ... 
arXiv:1911.04947v2 fatcat:k73v3b6chfccpjbkw7ncl6mgei

Efficient Searching with MCTS and Imitation Learning: A Case Study in Pommerman

Hailan Yang, Shengze Li, Xinhai Xu, Xunyun Liu, Zhuxuan Meng, Yongjun Zhang
2021 IEEE Access  
In this paper, we propose an efficient reinforcement learning approach that uses a more efficient Monte Carlo tree search combined with action pruning and flexible imitation learning to accelerate the  ...  Pommerman is a popular reinforcement learning environment because it imposes several challenges such as sparse and deceptive rewards and delayed action effects.  ...  Imitation learning with Backplay: Although the action filters can accelerate search performance, we still think they are insufficient for real-time running conditions in Pommerman.  ... 
doi:10.1109/access.2021.3061313 fatcat:sjcm6dm53bcgjneizslp5c5anu

Safer Deep RL with Shallow MCTS: A Case Study in Pommerman [article]

Bilal Kartal, Pablo Hernandez-Leal, Chao Gao, Matthew E. Taylor
2019 arXiv   pre-print
Safe reinforcement learning has many variants and it is still an open research problem.  ...  Compared to vanilla deep RL algorithms, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.  ...  Preliminaries, Reinforcement Learning: We start with the standard reinforcement learning setting of an agent interacting with an environment over a discrete number of steps.  ... 
arXiv:1904.05759v1 fatcat:b7kalbktibcadgt6qdykkg3cce

School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget [article]

Omkar Shelke, Hardik Meisheri, Harshad Khadilkar
2021 arXiv   pre-print
trained to imitate a noisy expert policy).  ...  Pommerman is a hybrid cooperative/adversarial multi-agent environment, with challenging characteristics in terms of partial observability, limited or no communication, sparse and delayed rewards, and restrictive  ...  Conclusion: In a hybrid cooperative/adversarial multi-agent game such as Pommerman, curriculum learning is a popular way of accelerating training.  ... 
arXiv:2102.11762v2 fatcat:il54s3qbmjdzlk6yos3kva22bq

Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL [article]

Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
2018 arXiv   pre-print
Deep reinforcement learning (DRL) has achieved great successes in recent years with the help of novel methods and higher compute power.  ...  Compared to vanilla A3C, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.  ...  Introduction: Deep reinforcement learning (DRL) combines reinforcement learning (Sutton and Barto 1998) with deep learning (LeCun, Bengio, and Hinton 2015), enabling better scalability and generalization  ... 
arXiv:1812.00045v1 fatcat:u7vv6pzp6rdqzcfqexutlpeqiu

Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication [article]

Hardik Meisheri, Harshad Khadilkar
2020 arXiv   pre-print
We describe our solution approach for Pommerman TeamRadio, a competition environment associated with NeurIPS 2019.  ...  signal to each agent during training and (iv) uses masking for catastrophic bad actions.  ...  In [5] they propose curriculum-based learning along with imitation learning to accelerate the training process in a complex environment such as Pommerman.  ... 
arXiv:2011.00424v1 fatcat:yeqneqmxcrcvnig5ch3snvwa4a

Backplay: "Man muss immer umkehren" [article]

Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho, Joan Bruna
2022 arXiv   pre-print
Model-free reinforcement learning (RL) requires a large number of trials to learn a good policy, especially in environments with sparse rewards.  ...  Rather than starting each training episode in the environment's fixed initial state, we start the agent near the end of the demonstration and move the starting point backwards during the course of training  ...  Pommerman can be difficult for reinforcement learning agents. The agent must learn to effectively wield the bomb action in order to win against competent opponents.  ... 
arXiv:1807.06919v5 fatcat:qh6zphhvubb5zojoylvppsoi7a
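The Backplay idea summarized in the snippet above (start each episode near the end of a demonstration and move the starting point backwards as training progresses) can be sketched as a small start-state scheduler. This is a hedged illustration of the curriculum concept only, not the paper's actual implementation; the function name, the linear schedule, and the `window` parameter are all illustrative assumptions.

```python
# Minimal sketch of a Backplay-style start-state curriculum.
# Assumption: a demonstration is a list of recorded states of length
# demo_length, and `progress` is the fraction of training completed.
# Early in training we reset near the demo's end (close to the goal);
# late in training we reset near its beginning (the hard, full task).

def backplay_start_index(demo_length, progress, window=5):
    """Pick the demonstration index to reset the episode from.

    progress: fraction of training completed, in [0, 1].
    window: small slack so the agent starts a few steps before the
            scheduled point (illustrative, not from the paper).
    """
    # Distance from the demo's end grows linearly with progress.
    offset = int(progress * (demo_length - 1))
    start = max(0, demo_length - 1 - offset - window)
    return start
```

A training loop would call this each episode, reset the environment to `demo[start]`, and run standard model-free RL from there; the paper itself samples start points from shifting windows rather than a single index.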

TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning [article]

Peng Sun, Jiechao Xiong, Lei Han, Xinghai Sun, Shuxing Li, Jiawei Xu, Meng Fang, Zhengyou Zhang
2020 arXiv   pre-print
Competitive Self-Play (CSP) based Multi-Agent Reinforcement Learning (MARL) has shown phenomenal breakthroughs recently.  ...  Despite the success, MARL training is extremely data-hungry, typically requiring billions (if not trillions) of frames from the environment during training in order to learn a high  ...  Thanks to Zhuobin Zheng (jackzbzheng) and Jiaming Lu (loyavejmlu) for initiating the ViZDoom experiments during the internship with Tencent AI Lab.  ... 
arXiv:2011.12895v2 fatcat:ntgchwkkt5b3nhjfveiuihd7tu

Multi-Agent Advisor Q-Learning

Sriram Ganapathi Subramanian, Matthew E. Taylor, Kate Larson, Mark Crowley
2022 The Journal of Artificial Intelligence Research  
In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable  ...  We describe the problem of ADvising Multiple Intelligent Reinforcement Agents (ADMIRAL) in nonrestrictive general-sum stochastic game environments and present two novel Q-learning based algorithms: ADMIRAL  ...  Part of this work has taken place in the Intelligent Robot Learning (IRL) Lab at the University of Alberta, which is supported in part by research grants from the Alberta Machine Intelligence Institute  ... 
doi:10.1613/jair.1.13445 fatcat:tgvw3lf62bdp5jlnxrakeszw5i

Multi-Agent Advisor Q-Learning [article]

Sriram Ganapathi Subramanian, Matthew E. Taylor, Kate Larson, Mark Crowley
2022 arXiv   pre-print
In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable  ...  We describe the problem of ADvising Multiple Intelligent Reinforcement Agents (ADMIRAL) in nonrestrictive general-sum stochastic game environments and present two novel Q-learning based algorithms: ADMIRAL  ...  Part of this work has taken place in the Intelligent Robot Learning (IRL) Lab at the University of Alberta, which is supported in part by research grants from the Alberta Machine Intelligence Institute  ... 
arXiv:2111.00345v5 fatcat:o3ybmwuuubebnn4ntqfxrhwzjy

Applied Machine Learning for Games: A Graduate School Course [article]

Yilei Zeng, Aayush Shah, Jameson Thai, Michael Zyda
2021 arXiv   pre-print
In this paper, we describe our machine learning course designed for graduate students interested in applying recent advances of deep learning and reinforcement learning towards gaming.  ...  Student projects cover use-cases such as training AI-bots in gaming benchmark environments and competitions, understanding human decision patterns in gaming, and creating intelligent non-playable characters  ...  DQN, Imitation Learning, Policy Gradients, and Transfer Learning were experimented with to train the agent to drive.  ... 
arXiv:2012.01148v2 fatcat:f44ln32jnbfhrearv234ylteru

A Survey of Deep Reinforcement Learning in Video Games [article]

Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, Dongbin Zhao
2019 arXiv   pre-print
Deep reinforcement learning (DRL) has made great achievements since it was proposed.  ...  This learning mechanism updates the policy to maximize the return with an end-to-end method.  ...  Acknowledgment: The authors would like to thank Qichao Zhang, Dong Li and Weifan Li for the helpful comments and discussions about this work.  ... 
arXiv:1912.10944v2 fatcat:fsuzp2sjrfcgfkyclrsyzflax4

A Survey and Critique of Multiagent Deep Reinforcement Learning [article]

Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor
2019 arXiv   pre-print
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods.  ...  reinforcement learning settings.  ...  for her visual designs for the figures in the article, to Frans Oliehoek, Sam Devlin, Marc Lanctot, Nolan Bard, Roberta Raileanu, Angeliki Lazaridou, and Yuhang Song for clarifications in their areas of  ... 
arXiv:1810.05587v2 fatcat:h4ei5zx2xfa7xocktlefjrvef4

Dagstuhl Reports, Volume 9, Issue 12, December 2019, Complete Issue

2020
AI for Accessibility in Games Tommy Thompson  ...  In contrast, there has been very little progress on this kind of problem in the machine learning and reinforcement learning community.  ...  This would not only aid in the use of abstract FMs for commercial games, but also in applications of automatically learning FMs with model-based reinforcement learning approaches.  ... 
doi:10.4230/dagrep.9.12 fatcat:hebigxkvinhjdb6qlg3j5hw25u