Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games [article]

Stefanos Leonardos, Will Overman, Ioannis Panageas, Georgios Piliouras
2021 arXiv   pre-print
In our main technical result, we prove fast convergence of independent policy gradient to Nash policies by adapting recent gradient dominance property arguments developed for single agent MDPs to multi-agent  ...  We present a novel definition of Markov Potential Games (MPG) that generalizes prior attempts at capturing complex stateful multi-agent coordination.  ...  Acknowledgements This project is supported in part by NRF2019-NRF-ANR095 ALIAS grant, grant PIE-SGP-AI-2018-01, NRF 2018 Fellowship NRF-NRFF2018-07, AME Programmatic Fund (Grant No.  ... 
arXiv:2106.01969v3 fatcat:43d5kzbc25agflewwhte3m3ora

Independent Natural Policy Gradient Always Converges in Markov Potential Games [article]

Roy Fox, Stephen McAleer, Will Overman, Ioannis Panageas
2021 arXiv   pre-print
In this paper, we focus on a particular class of multi-agent mixed cooperative/competitive stochastic games called Markov Potential Games (MPGs), which include cooperative games as a special case.  ...  Recent results have shown that independent policy gradient converges in MPGs but it was not known whether Independent Natural Policy Gradient converges in MPGs as well.  ...  Our main technical result is to show that Independent Natural Policy Gradient (INPG) globally converges to equilibrium policies for a fixed stepsize in Markov Potential Games.  ...
arXiv:2110.10614v1 fatcat:yrs4t2frjvbjho4fl5pdlpdkhq

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms [article]

Kaiqing Zhang, Zhuoran Yang, Tamer Başar
2021 arXiv   pre-print
the mean-field regime, (non-)convergence of policy-based methods for learning in games, etc.  ...  multi-agent RL (MARL), a domain with a relatively long history, and has recently re-emerged due to advances in single-agent RL techniques.  ...  Markov Potential Games From a game-theoretic perspective, a more general framework to embrace cooperation is potential games [163], where there exists some potential function shared by all agents, such  ...
arXiv:1911.10635v2 fatcat:ihlhtjlhnrdizbkcfzsnz5urfq

Multi-Agent Reinforcement Learning with Temporal Logic Specifications [article]

Lewis Hammond and Alessandro Abate and Julian Gutierrez and Michael Wooldridge
2021 arXiv   pre-print
We provide correctness and convergence guarantees for our main algorithm - ALMANAC (Automaton/Logic Multi-Agent Natural Actor-Critic) - even when using function approximation.  ...  In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour.  ...  Hammond acknowledges the support of an EPSRC Doctoral Training Partnership studentship (Reference: 2218880) and the University of Oxford ARC facility. 2 Wooldridge and Abate acknowledge the support of  ... 
arXiv:2102.00582v2 fatcat:wq2vons5sbhkbjjvq3iykksocy

Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory [article]

Yunlong Lu, Kai Yan
2020 arXiv   pre-print
Solution concepts from game theory give inspiration to algorithms which try to evaluate the agents or find better solutions in multi-agent systems.  ...  Fictitious self-play becomes popular and has a great impact on the algorithm of multi-agent reinforcement learning.  ...  It is later proved by [68] that four special types of games can converge globally: games with an interior ESS (Evolutionary Stable Strategy), zero-sum games, potential games and all super-modular games  ... 
arXiv:2001.06487v3 fatcat:o2iovnsbxfgp5omk67jwuoonma

Polymatrix Competitive Gradient Descent [article]

Jeffrey Ma, Alistair Letcher, Florian Schäfer, Yuanyuan Shi, Anima Anandkumar
2021 arXiv   pre-print
We use PCGD to optimize policies in multi-agent reinforcement learning and demonstrate its advantages in Snake, Markov soccer and an electricity market game.  ...  Agents trained by PCGD outperform agents trained with simultaneous gradient descent, symplectic gradient adjustment, and extragradient in Snake and Markov soccer games and on the electricity market game  ...  FS gratefully acknowledge support by the Air Force Office of Scientific Research under award number FA9550-18-1-0271 (Games for Computation and Learning) and the Ronald and Maxine Linde Institute of Economic  ... 
arXiv:2111.08565v1 fatcat:afc7wklgrjcede3cy7h4ionz4e

Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence [article]

Dongsheng Ding and Chen-Yu Wei and Kaiqing Zhang and Mihailo R. Jovanović
2022 arXiv   pre-print
We examine global non-asymptotic convergence properties of policy gradient methods for multi-agent reinforcement learning (RL) problems in Markov potential games (MPG).  ...  Moreover, we identify a class of independent policy gradient algorithms that enjoys convergence for both zero-sum Markov games and Markov cooperative games with the players that are oblivious to the types  ...  Acknowledgement The work of D. Ding and M. R. Jovanović is supported in part by the National Science Foundation under awards ECCS-1708906 and 1809833.  ... 
arXiv:2202.04129v2 fatcat:gzchgvmtorfjhnzcsmcuup2q3e

Bi-level Actor-Critic for Multi-agent Coordination [article]

Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Weinan Zhang, Jun Wang
2020 arXiv   pre-print
Coordination is one of the essential problems in multi-agent systems.  ...  In this paper, we treat agents unequally and consider Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority, especially in cooperative environments  ...  Acknowledgments This work is supported by "New Generation of AI 2030" Major Project 2018AAA0100900 and NSFC 61702327, 61632017.  ... 
arXiv:1909.03510v3 fatcat:2wcqmkmzcfdmpnond7z223lh3a

Gradient play in stochastic games: stationary points, convergence, and sample complexity [article]

Runyu Zhang, Zhaolin Ren, Na Li
2021 arXiv   pre-print
Further, for a subclass of SGs called Markov potential games (which includes the cooperative setting with identical rewards among agents as an important special case), we design a sample-based reinforcement  ...  learning algorithm and give a non-asymptotic global convergence rate analysis for both exact gradient play and our sample-based learning algorithm.  ...  Gradient play for Markov potential games We have discussed that the main problem for the global convergence of gradient play for general SGs is that the vector field $\{\nabla_{\theta_i} J_i(\theta)\}_{i=1}^{n}$ is not conservative  ...
arXiv:2106.00198v4 fatcat:odcv6fhgkjdbdcastxlbc5m6bu

Multi-Agent Actor-Critic with Generative Cooperative Policy Network [article]

Heechang Ryu, Hayong Shin, Jinkyoo Park
2018 arXiv   pre-print
We propose an efficient multi-agent reinforcement learning approach to derive equilibrium strategies for multi-agents who are participating in a Markov game.  ...  Mainly, we are focused on obtaining decentralized policies for agents to maximize the performance of a collaborative task by all the agents, which is similar to solving a decentralized Markov decision  ...  Deriving control policies for multi-agent system can be generally described as Markov Game (MG), an extension of Markov Decision Process (MDP) to a multi-agent system.  ... 
arXiv:1810.09206v1 fatcat:66pvdm42jnf3jnjqmuk7svo5ym

Bi-Level Actor-Critic for Multi-Agent Coordination

Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Weinan Zhang, Jun Wang
Coordination is one of the essential problems in multi-agent systems.  ...  In this paper, we treat agents unequally and consider Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority, especially in cooperative environments  ...  Acknowledgments This work is supported by "New Generation of AI 2030" Major Project 2018AAA0100900 and NSFC 61702327, 61632017.  ... 
doi:10.1609/aaai.v34i05.6226 fatcat:y5o2xzjkdrhirbfaq6il2q6mtq

Learning Existing Social Conventions via Observationally Augmented Self-Play [article]

Adam Lerer, Alexander Peysakhovich
2019 arXiv   pre-print
We consider the problem of an agent learning a policy for a coordination game in a simulated environment and then using this policy when it enters an existing group.  ...  A group's conventions can be viewed as a choice of equilibrium in a coordination game.  ...  CONVENTIONS IN MARKOV GAMES A partially observed Markov game $G$ [22, 29] consists of a set of players $P = \{1, \dots, N\}$, a set of states $S$, a set of actions for every player $A_i$ with the global set  ...
arXiv:1806.10071v3 fatcat:pbh4pwtk5nclvpx73glli36kua

Prosocial learning agents solve generalized Stag Hunts better than selfish ones [article]

Alexander Peysakhovich, Adam Lerer
2017 arXiv   pre-print
It is known that in general-sum games reactive training can lead groups of agents to converge to inefficient outcomes. We focus on one such class of environments: Stag Hunt games.  ...  Deep reinforcement learning has become an important paradigm for constructing agents that can enter complex multi-agent situations and improve their policies through experience.  ...  Details of Markov Game Training For grid world Markov games, we train policies π modeled by a multi-layer convolutional neural network.  ... 
arXiv:1709.02865v2 fatcat:ujxek3xbzvhlpgq4ficc63ldoi

Decentralized Multi-Agent Reinforcement Learning with Networked Agents: Recent Advances [article]

Kaiqing Zhang, Zhuoran Yang, Tamer Başar
2019 arXiv   pre-print
Multi-agent reinforcement learning (MARL) has long been a significant and everlasting research topic in both machine learning and control.  ...  In this paper, we review some recent advances in a sub-area of this topic: decentralized MARL with networked agents.  ...  Markov/Stochastic Games: As a direct generalization of MDPs to the multi-agent setting, Markov games (MGs), also known as stochastic games (Shapley, 1953), have long been treated as a classical framework  ...
arXiv:1912.03821v1 fatcat:555igege7balrb3iiavbkcj3dy

Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial [article]

Amal Feriani, Ekram Hossain
2020 arXiv   pre-print
In this context, this tutorial focuses on the role of DRL with an emphasis on deep Multi-Agent Reinforcement Learning (MARL) for AI-enabled 6G networks.  ...  The key enabling technologies of future 6G networks, such as intelligent meta-surfaces, aerial networks, and AI at the edge, involve more than one agent which motivates the importance of multi-agent learning  ...  The authors are with the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB, Canada  ...
arXiv:2011.03615v1 fatcat:zzbotslc3vczxmwf72y43k3ica