Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication
[article]
2020
arXiv
pre-print
The defining feature of our algorithm is achieving sample efficiency within a restrictive computational budget while beating the previous years' learning agents. ...
We show that the proposed approach is able to achieve competitive performance within half a million games of training, significantly faster than other studies in the literature. ...
Communication Protocol: In the TeamRadio variant of Pommerman, limited communication between the agents is allowed. ...
arXiv:2011.00424v1
fatcat:yeqneqmxcrcvnig5ch3snvwa4a
Multi-agent reinforcement learning with approximate model learning for competitive games
2019
PLoS ONE
In the test phase, we use competitive multi-agent environments to demonstrate, by comparison, the usefulness and superiority of the proposed method in terms of learning efficiency and goal achievement. ...
The actor networks enable the agents to communicate using forward and backward paths while the critic network helps to train the actors by delivering them gradient signals based on their contribution to ...
With deterministic policy gradients, agents learn differentiable communication protocols for coordinating with others. ...
doi:10.1371/journal.pone.0222215
pmid:31509568
pmcid:PMC6739057
fatcat:4lp2babw2bcjxnytrqp6cqrnqq
Team-partitioned, opaque-transition reinforcement learning
1999
Proceedings of the third annual conference on Autonomous Agents - AGENTS '99
Multi-agent scenarios are opaque-transition, as team members are not always in full communication with one another and adversaries may affect the environment. ...
It is an adaptation of traditional RL methods that is applicable in complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities. ...
Conclusion: TPOT-RL is an adaptation of RL to non-Markovian multi-agent domains with opaque transitions, large state spaces, hidden state and limited training opportunities. ...
doi:10.1145/301136.301195
dblp:conf/agents/StoneV99
fatcat:lvmn5vaw2rdepbcrctpw4uxzp4
Team-Partitioned, Opaque-Transition Reinforcement Learning
[chapter]
1999
Lecture Notes in Computer Science
Multi-agent scenarios are opaque-transition, as team members are not always in full communication with one another and adversaries may affect the environment. ...
It is an adaptation of traditional RL methods that is applicable in complex, non-Markovian, multi-agent domains with large state spaces and limited training opportunities. ...
Conclusion: TPOT-RL is an adaptation of RL to non-Markovian multi-agent domains with opaque transitions, large state spaces, hidden state and limited training opportunities. ...
doi:10.1007/3-540-48422-1_21
fatcat:lclqx5qoenhtlm6minc47o6chu
The Hanabi Challenge: A New Frontier for AI Research
[article]
2019
arXiv
pre-print
In recent years, machine learning has made dramatic advances with artificial agents reaching superhuman performance in challenge domains like Go, Atari, and some variants of poker. ...
We believe developing novel techniques capable of imbuing artificial agents with such theory of mind will not only be crucial for their success in Hanabi, but also in broader collaborative efforts, and ...
Acknowledgements: We would like to thank many people: Matthieu d'Epenoux of Cocktail Games and Antoine Bauza, who designed Hanabi, for their support on this project; Alden Christianson for help with coordinating ...
arXiv:1902.00506v1
fatcat:ri6mermrefdffg4bvnixt5hdu4
Adversarially Guided Self-Play for Adopting Social Conventions
[article]
2020
arXiv
pre-print
Robotic agents must adopt existing social conventions in order to be effective teammates. ...
Prior work has identified a method of combining self-play with paired input-output data gathered from existing agents in order to learn their social convention without interacting with them. ...
Finally, our third domain tests three-agent teams in a multi-step coordination game involving multiple communication conventions. ...
arXiv:2001.05994v2
fatcat:xxrj6uguvvc3zpp6drt4ysyrbe
A Survey of Ad Hoc Teamwork: Definitions, Methods, and Open Problems
[article]
2022
arXiv
pre-print
Ad hoc teamwork is the well-established research problem of designing agents that can collaborate with new teammates without prior coordination. This survey makes a two-fold contribution. ...
Second, it discusses the progress that has been made in the field so far, and identifies the immediate and long-term open problems that need to be addressed in the field of ad hoc teamwork. ...
Unlike multi-agent reinforcement learning (see Section 3.2), which supports joint training for all agents in the team, AHT does not assume that the deployed teammates are the same as those the learner ...
arXiv:2202.10450v1
fatcat:twqmsvosbjbepnyxf3f3v6hlba
School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget
[article]
2021
arXiv
pre-print
Pommerman is a hybrid cooperative/adversarial multi-agent environment, with challenging characteristics in terms of partial observability, limited or no communication, sparse and delayed rewards, and restrictive ...
In this paper, we focus on developing a curriculum for learning a robust and promising policy in a constrained computational budget of 100,000 games, starting from a fixed base policy (which is itself ...
Conclusion: In a hybrid cooperative/adversarial multi-agent game such as Pommerman, curriculum learning is a popular way of accelerating training. ...
arXiv:2102.11762v2
fatcat:il54s3qbmjdzlk6yos3kva22bq
From Motor Control to Team Play in Simulated Humanoid Football
[article]
2021
arXiv
pre-print
that extend far beyond the body itself, ultimately involving coordination with other agents. ...
However, there is limited research aimed at their integration. We study this problem by training teams of physically simulated humanoid avatars to play football in a realistic virtual environment. ...
Tobias Springenberg and Peter Stone for their insights and helpful comments; Tyler Liechty and Amy Merrick at DeepMind for assistance obtaining motion capture data; and Darren Carikas for assistance with ...
arXiv:2105.12196v1
fatcat:owunoflhgbfofo63hhqbkyg4hy
Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning
[article]
2020
arXiv
pre-print
In multi-agent reinforcement learning, discovering successful collective behaviors is challenging as it requires exploring a joint action space that grows exponentially with the number of agents. ...
Our experiments show improved performance across many cooperative multi-agent problems. ...
We also acknowledge funding in support of this work from the Fonds de Recherche Nature et Technologies (FRQNT) and Mitacs, as well as Compute Canada for supplying computing resources. ...
arXiv:1908.02269v4
fatcat:rnk6zpb2sfcl7bzpct6zaupipq
Winning Isn't Everything: Enhancing Game Development with Intelligent Agents
[article]
2020
arXiv
pre-print
In this paper, we study the problem of training intelligent agents in service of game development. ...
Unlike the agents built to "beat the game", our agents aim to produce human-like behavior to help with game evaluation and balancing. ...
ACKNOWLEDGEMENT: The authors are thankful to EA Sports and other game team partners for their support and collaboration. ...
arXiv:1903.10545v5
fatcat:v2vc7pxzvrfbbiezdttz7mfyze
Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
[article]
2020
arXiv
pre-print
of training agents in a stage-wise manner. ...
In multi-agent games, the complexity of the environment can grow exponentially as the number of agents increases, so it is particularly challenging to learn good policies when the agent population is large ...
In the adversarial battle game, similar to the grassland game, the entity types are the agent itself, other teammates, enemies, and food. ...
arXiv:2003.10423v1
fatcat:e7csz6bdprdbnc6nqxfb4l76ee
Continual Match Based Training in Pommerman: Technical Report
[article]
2018
arXiv
pre-print
with no communication. ...
Specifically, we propose a COntinual Match BAsed Training (COMBAT) framework for training a population of advantage-actor-critic (A2C) agents in Pommerman, a partially observable multi-agent environment ...
INTRODUCTION: Pommerman [17] is a multi-agent environment based on the classic console game Bomberman. ...
arXiv:1812.07297v1
fatcat:qm6vwut5ovhxhluak5b7yet6oa
Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution Concepts
[article]
2021
arXiv
pre-print
Methods which use recursive reasoning or updating also perform well, including the feedback NE and archive multi-agent adversarial IRL. ...
Multi-agent inverse reinforcement learning (MIRL) can be used to learn reward functions from agents in social environments. ...
In the initial policy step, an adversarial training algorithm solves for a NE strategy in a zero-sum stochastic game parameterized by the current reward function. ...
arXiv:2109.01178v1
fatcat:v635kuj4wfg4nndxvclbaubv2q
Scaling Up Multiagent Reinforcement Learning for Robotic Systems: Learn an Adaptive Sparse Communication Graph
[article]
2020
arXiv
pre-print
Through this sparsity structure, the agents can communicate in an effective as well as efficient way by selectively attending only to the agents that matter most, and thus the scale of the MARL problem ...
The complexity of multiagent reinforcement learning (MARL) in multiagent systems increases exponentially with respect to the agent number. ...
As a star graph is a connected graph with the minimum possible number of edges, this communication protocol is both effective and efficient. ...
arXiv:2003.01040v2
fatcat:ch3zgjpd2vcmbjpfis5nm3gglm