Accelerating Training in Pommerman with Imitation and Reinforcement Learning
[article]
2019
arXiv
pre-print
The proposed methodology is able to beat heuristic and pure reinforcement learning baselines with a combined 100,000 training games, significantly faster than other non-tree-search methods in literature ...
Our methodology involves training an agent initially through imitation learning on a noisy expert policy, followed by a proximal-policy optimization (PPO) reinforcement learning algorithm. ...
Conclusion We posit that the use of imitation followed by reinforcement learning is an effective way to reduce the training effort in Pommerman. ...
arXiv:1911.04947v2
fatcat:k73v3b6chfccpjbkw7ncl6mgei
Efficient Searching with MCTS and Imitation Learning: A Case Study in Pommerman
2021
IEEE Access
In this paper, we propose an efficient reinforcement learning approach that uses a more efficient Monte Carlo tree search combined with action pruning and flexible imitation learning to accelerate the ...
Pommerman is a popular reinforcement learning environment because it imposes several challenges such as sparse and deceptive rewards and delayed action effects. ...
IMITATION LEARNING WITH BACKPLAY Although the action filters could accelerate the searching performance, we still think it is insufficient to be applied in a real-time running condition in Pommerman. ...
doi:10.1109/access.2021.3061313
fatcat:sjcm6dm53bcgjneizslp5c5anu
Safer Deep RL with Shallow MCTS: A Case Study in Pommerman
[article]
2019
arXiv
pre-print
Safe reinforcement learning has many variants and it is still an open research problem. ...
Compared to vanilla deep RL algorithms, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game. ...
PRELIMINARIES 3.1 Reinforcement Learning We start with the standard reinforcement learning setting of an agent interacting in an environment over a discrete number of steps. ...
arXiv:1904.05759v1
fatcat:b7kalbktibcadgt6qdykkg3cce
School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget
[article]
2021
arXiv
pre-print
... trained to imitate a noisy expert policy). ...
Pommerman is a hybrid cooperative/adversarial multi-agent environment, with challenging characteristics in terms of partial observability, limited or no communication, sparse and delayed rewards, and restrictive ...
Conclusion In a hybrid cooperative/adversarial multi-agent game such as Pommerman, curriculum learning is a popular way of accelerating training. ...
arXiv:2102.11762v2
fatcat:il54s3qbmjdzlk6yos3kva22bq
Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL
[article]
2018
arXiv
pre-print
Deep reinforcement learning (DRL) has achieved great successes in recent years with the help of novel methods and higher compute power. ...
Compared to vanilla A3C, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game. ...
Introduction Deep reinforcement learning (DRL) combines reinforcement learning (Sutton and Barto 1998) with deep learning (LeCun, Bengio, and Hinton 2015), enabling better scalability and generalization ...
arXiv:1812.00045v1
fatcat:u7vv6pzp6rdqzcfqexutlpeqiu
Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication
[article]
2020
arXiv
pre-print
We describe our solution approach for Pommerman TeamRadio, a competition environment associated with NeurIPS 2019. ...
... signal to each agent during training and (iv) uses masking for catastrophic bad actions. ...
In [5] they propose curriculum-based learning along with imitation learning to accelerate the training process in a complex environment such as Pommerman. [Figure 1: Sample board of game] We ...
arXiv:2011.00424v1
fatcat:yeqneqmxcrcvnig5ch3snvwa4a
Backplay: "Man muss immer umkehren"
[article]
2022
arXiv
pre-print
Model-free reinforcement learning (RL) requires a large number of trials to learn a good policy, especially in environments with sparse rewards. ...
Rather than starting each training episode in the environment's fixed initial state, we start the agent near the end of the demonstration and move the starting point backwards during the course of training ...
Pommerman can be difficult for reinforcement learning agents. The agent must learn to effectively wield the bomb action in order to win against competent opponents. ...
arXiv:1807.06919v5
fatcat:qh6zphhvubb5zojoylvppsoi7a
TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning
[article]
2020
arXiv
pre-print
Competitive Self-Play (CSP) based Multi-Agent Reinforcement Learning (MARL) has shown phenomenal breakthroughs recently. ...
Despite the success, MARL training is extremely data thirsty, typically requiring billions (if not trillions) of frames to be seen from the environment during training in order to learn a high ...
Thanks Zhuobin Zheng (jackzbzheng) and Jiaming Lu (loyavejmlu) for initiating the ViZDoom experiments during the internship with Tencent AI Lab. ...
arXiv:2011.12895v2
fatcat:ntgchwkkt5b3nhjfveiuihd7tu
Multi-Agent Advisor Q-Learning
2022
The Journal of Artificial Intelligence Research
In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable ...
We describe the problem of ADvising Multiple Intelligent Reinforcement Agents (ADMIRAL) in nonrestrictive general-sum stochastic game environments and present two novel Q-learning based algorithms: ADMIRAL ...
Part of this work has taken place in the Intelligent Robot Learning (IRL) Lab at the University of Alberta, which is supported in part by research grants from the Alberta Machine Intelligence Institute ...
doi:10.1613/jair.1.13445
fatcat:tgvw3lf62bdp5jlnxrakeszw5i
Multi-Agent Advisor Q-Learning
[article]
2022
arXiv
pre-print
In the last decade, there have been significant advances in multi-agent reinforcement learning (MARL) but there are still numerous challenges, such as high sample complexity and slow convergence to stable ...
We describe the problem of ADvising Multiple Intelligent Reinforcement Agents (ADMIRAL) in nonrestrictive general-sum stochastic game environments and present two novel Q-learning based algorithms: ADMIRAL ...
Part of this work has taken place in the Intelligent Robot Learning (IRL) Lab at the University of Alberta, which is supported in part by research grants from the Alberta Machine Intelligence Institute ...
arXiv:2111.00345v5
fatcat:o3ybmwuuubebnn4ntqfxrhwzjy
Applied Machine Learning for Games: A Graduate School Course
[article]
2021
arXiv
pre-print
In this paper, we describe our machine learning course designed for graduate students interested in applying recent advances of deep learning and reinforcement learning towards gaming. ...
Student projects cover use-cases such as training AI-bots in gaming benchmark environments and competitions, understanding human decision patterns in gaming, and creating intelligent non-playable characters ...
DQN, Imitation Learning, Policy Gradients, and Transfer Learning were experimented with to train the agent to drive. ...
arXiv:2012.01148v2
fatcat:f44ln32jnbfhrearv234ylteru
A Survey of Deep Reinforcement Learning in Video Games
[article]
2019
arXiv
pre-print
Deep reinforcement learning (DRL) has made great achievements since it was proposed. ...
This learning mechanism updates the policy to maximize the return with an end-to-end method. ...
ACKNOWLEDGMENT The authors would like to thank Qichao Zhang, Dong Li and Weifan Li for the helpful comments and discussions about this work. ...
arXiv:1912.10944v2
fatcat:fsuzp2sjrfcgfkyclrsyzflax4
A Survey and Critique of Multiagent Deep Reinforcement Learning
[article]
2019
arXiv
pre-print
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. ...
... reinforcement learning settings. ...
for her visual designs for the figures in the article, to Frans Oliehoek, Sam Devlin, Marc Lanctot, Nolan Bard, Roberta Raileanu, Angeliki Lazaridou, and Yuhang Song for clarifications in their areas of ...
arXiv:1810.05587v2
fatcat:h4ei5zx2xfa7xocktlefjrvef4
Dagstuhl Reports, Volume 9, Issue 12, December 2019, Complete Issue
2020
AI for Accessibility in Games Tommy Thompson ...
In contrast, there has been very little progress on this kind of problem in the machine learning and reinforcement learning community. ...
This would not only aid in the use of abstract FMs for commercial games, but also in applications of automatically learning FMs with model-based reinforcement learning approaches. ...
doi:10.4230/dagrep.9.12
fatcat:hebigxkvinhjdb6qlg3j5hw25u