
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games [article]

Yu Bai, Chi Jin, Huan Wang, Caiming Xiong
2021 arXiv   pre-print
It remains open how to learn the Stackelberg equilibrium -- an asymmetric analog of the Nash equilibrium -- in general-sum games efficiently from noisy samples.  ...  This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium, in the bandit feedback setting where we only observe noisy samples of the reward.  ...  The authors also thank the Theory of Reinforcement Learning program at the Simons Institute (Fall 2020) for hosting the authors and incubating our initial discussions.  ... 
arXiv:2102.11494v3 fatcat:cboq3fnaqvbbhjtxxnk4kaijla

Convergence of Learning Dynamics in Stackelberg Games [article]

Tanner Fiez, Benjamin Chasnov, Lillian J. Ratliff
2019 arXiv   pre-print
equilibria in zero-sum games.  ...  This paper investigates the convergence of learning dynamics in Stackelberg games.  ...  We then draw connections between learning in Stackelberg games and existing work in zero-sum and general-sum games relevant to GANs and multi-agent learning, respectively.  ... 
arXiv:1906.01217v3 fatcat:awbusd3qlbebvnhwnf2emjqtdu

Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study

Tanner Fiez, Benjamin Chasnov, Lillian J. Ratliff
2020 International Conference on Machine Learning  
We deviate from this paradigm and provide a comprehensive study of learning in Stackelberg games.  ...  updates for zero-sum and general-sum games.  ...  In the zero-sum setting, the fact that Nash equilibria are a subset of Stackelberg equilibria for finite games is well-known (Basar & Olsder, 1998).  ... 
dblp:conf/icml/FiezCR20 fatcat:3gdlurf52rbapprribhmxzssgu
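
The entry above recalls that in finite zero-sum games, Nash equilibria are a subset of Stackelberg equilibria (Basar & Olsder, 1998). A minimal pure-strategy sketch, on a toy payoff matrix invented here (not from the paper) that has a saddle point: the leader's commitment value against a best-responding follower coincides with the maximin (Nash) value, since commitment adds nothing in the zero-sum case.

```python
# Zero-sum matrix game: row player (leader) gets A[i][j], column player gets
# -A[i][j]. Toy matrix chosen to have a pure saddle point.
A = [[3, 1],
     [4, 2]]

# Pure maximin (saddle-point) value: what the leader guarantees without commitment.
maximin = max(min(row) for row in A)

# Pure Stackelberg with observed commitment: leader picks row i; the follower,
# whose payoff is -A[i][j], best-responds by minimizing A[i][j].
def follower_br(i):
    return min(range(len(A[0])), key=lambda j: A[i][j])

stackelberg = max(A[i][follower_br(i)] for i in range(len(A)))

print(maximin, stackelberg)  # -> 2 2
```

With a saddle point the two values agree, illustrating why every Nash strategy is also a Stackelberg strategy here.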

Learning Adversarially Robust Policies in Multi-Agent Games [article]

Eric Zhao, Alexander R. Trott, Caiming Xiong, Stephan Zheng
2022 arXiv   pre-print
We show that worst-case coarse-correlated equilibria can be efficiently approximated in smooth games and propose a framework that uses the worst-case evaluation scheme to learn robust player policies.  ...  In experiments, our framework learns robust policies in repeated N-player matrix games and, when applied to deep multi-agent reinforcement learning, can scale to complex spatiotemporal games.  ...  We prove there are efficient algorithms for adversarially sampling these rational equilibrium behaviors in smooth games.  ... 
arXiv:2106.05492v2 fatcat:wmxs4p2oyvebdfhrzwbq5s6xvm

Do GANs always have Nash equilibria?

Farzan Farnia, Asuman E. Ozdaglar
2020 International Conference on Machine Learning  
...the Nash equilibrium of the underlying game. Such issues raise the question of the existence of Nash equilibria in GAN zero-sum games.  ...  Generative adversarial networks (GANs) represent a zero-sum game between two machine players, a generator and a discriminator, designed to learn the distribution of data.  ...  Unlike the traditional approaches to distribution learning, GANs view the learning problem as a zero-sum game between the following two players: 1) generator G aiming to generate real-like samples from  ... 
dblp:conf/icml/FarniaO20 fatcat:ald66rr7vba2fdkzdc45ivwdde

Gradient Descent-Ascent Provably Converges to Strict Local Minmax Equilibria with a Finite Timescale Separation [article]

Tanner Fiez, Lillian Ratliff
2020 arXiv   pre-print
For the parameter choice of τ=1, it is known that the learning dynamics are not guaranteed to converge to a game-theoretically meaningful equilibrium in general.  ...  We study the role that a finite timescale separation parameter τ has on gradient descent-ascent in two-player non-convex, non-concave zero-sum games where the learning rate of player 1 is denoted by γ_1.  ...  Finally, we thank Mescheder et al. (2018) for providing a high quality open source implementation of the generative adversarial network experiments they performed, which facilitated and expedited the  ... 
arXiv:2009.14820v1 fatcat:75yy2bdxdncu5ekg2msf2epvwa
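
The entry above (and its ICLR companion below) studies gradient descent-ascent where player 2's learning rate is a multiple τ of player 1's, γ_2 = τγ_1. A minimal sketch of the update rule on a toy quadratic f(x, y) = x²/2 + 2xy − y²/2, invented here for illustration: it has a strict local minmax at the origin and simultaneous GDA converges for a wide range of τ, whereas the papers' analysis concerns when a sufficiently large finite τ is required in general non-convex, non-concave games.

```python
# Two-timescale gradient descent-ascent on f(x, y) = x^2/2 + 2xy - y^2/2,
# which has a strict local minmax at (0, 0). Player 1 (min) uses rate g1;
# player 2 (max) uses g2 = tau * g1. Simultaneous updates.
def gda(tau, g1=0.05, steps=500, x=1.0, y=1.0):
    g2 = tau * g1
    for _ in range(steps):
        gx = x + 2 * y      # df/dx
        gy = 2 * x - y      # df/dy
        x, y = x - g1 * gx, y + g2 * gy
    return x, y

x, y = gda(tau=4.0)
print(abs(x) < 1e-3, abs(y) < 1e-3)  # -> True True
```

On this quadratic the iterates spiral into the origin; the interesting regimes in the papers are those where τ = 1 fails but a finite τ > τ* succeeds.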

Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation

Tanner Fiez, Lillian J. Ratliff
2021 International Conference on Learning Representations  
We study the role that a finite timescale separation parameter τ has on gradient descent-ascent in non-convex, non-concave zero-sum games where the learning rate of player 1 is denoted by γ_1 and the learning rate of player 2 is defined to be γ_2 = τγ_1.  ...  for the genericity statements regarding local minmax equilibria in zero-sum games).  ... 
dblp:conf/iclr/FiezR21 fatcat:3pqkvsmvivha7eoo2bprilt3vm

A Survey of Decision Making in Adversarial Games [article]

Xiuxian Li, Min Meng, Yiguang Hong, Jie Chen
2022 arXiv   pre-print
Along this line, this paper provides a systematic survey on three main game models widely employed in adversarial games, i.e., zero-sum normal-form and extensive-form games, Stackelberg (security) games, and zero-sum differential games, from an array of perspectives, including basic knowledge of game models, (approximate) equilibrium concepts, problem classifications, research frontiers, (approximate) optimal  ...  In what follows, general Stackelberg games and Stackelberg security games [37] are introduced, where the second one is an important special case of general SGs. General Stackelberg Game (GSG).  ... 
arXiv:2207.07971v1 fatcat:egfm3ha6hrbijcu4rhpanspz2i

Coordinating Followers to Reach Better Equilibria: End-to-End Gradient Descent for Stackelberg Games

Kai Wang, Lily Xu, Andrew Perrault, Michael K. Reiter, Milind Tambe
A growing body of work in game theory extends the traditional Stackelberg game to settings with one leader and multiple followers who play a Nash equilibrium.  ...  Standard approaches for computing equilibria in these games reformulate the followers' best response as constraints in the leader's optimization problem.  ...  The computations in this paper were run on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University.  ... 
doi:10.1609/aaai.v36i5.20457 fatcat:gb5p3flggbajbcjdbhwqwkhsmm
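
The entry above concerns Stackelberg games where the leader's optimization embeds the followers' equilibrium response as constraints. For the single-follower bimatrix case, a crude grid-search sketch over the leader's mixed commitment, with toy payoffs invented here and ties broken in the leader's favor (the strong Stackelberg convention) — this is not the paper's end-to-end gradient method:

```python
# Leader commits to a mixed strategy p = P(play row 0); the follower observes
# the commitment and best-responds. Toy payoff bimatrix (leader A, follower B).
A = [[1, 3], [0, 2]]   # leader payoffs
B = [[1, 0], [0, 1]]   # follower payoffs

def stackelberg_grid(A, B, steps=101):
    best_p, best_val = None, float("-inf")
    for k in range(steps):
        p = k / (steps - 1)
        # Follower's expected utility for each column under commitment p.
        u = [p * B[0][j] + (1 - p) * B[1][j] for j in range(2)]
        # Best-response set; ties broken in the leader's favor.
        br = [j for j in range(2) if u[j] == max(u)]
        val = max(p * A[0][j] + (1 - p) * A[1][j] for j in br)
        if val > best_val:
            best_p, best_val = p, val
    return best_p, best_val

p, v = stackelberg_grid(A, B)
print(round(p, 2), round(v, 2))  # -> 0.5 2.5
```

In this bimatrix the leader's unique Nash payoff is 1 (row 0 strictly dominates for the leader, so the follower plays column 0), while committing to p = 0.5 yields 2.5, showing why commitment is valuable in general-sum games.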

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach [article]

Yuanhao Wang, Guodong Zhang, Jimmy Ba
2019 arXiv   pre-print
Many tasks in modern machine learning can be formulated as finding equilibria in sequential games.  ...  In particular, two-player zero-sum sequential games, also known as minimax optimization, have received growing interest.  ...  In particular, we note that the concept of Nash equilibrium or local Nash does not reflect the order between the min-player and the max-player and may not exist even  ... 
arXiv:1910.07512v2 fatcat:5cmual4pnffxrl6dzzspkmbegm

Bi-Level Actor-Critic for Multi-Agent Coordination

Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Weinan Zhang, Jun Wang
Under Markov games, we formally define the bi-level reinforcement learning problem of finding a Stackelberg equilibrium.  ...  We found that the proposed bi-level actor-critic algorithm successfully converged to the Stackelberg equilibria in matrix games and found an asymmetric solution in a highway merge environment.  ...  Acknowledgments This work is supported by "New Generation of AI 2030" Major Project 2018AAA0100900 and NSFC 61702327, 61632017.  ... 
doi:10.1609/aaai.v34i05.6226 fatcat:y5o2xzjkdrhirbfaq6il2q6mtq

Game of GANs: Game-Theoretical Models for Generative Adversarial Networks [article]

Monireh Mohebbi Moghadam, Bahar Boroomand, Mohammad Jalali, Arman Zareian, Alireza DaeiJavad, Mohammad Hossein Manshaei, Marwan Krunz
2022 arXiv   pre-print
Fundamentally, a GAN is a game between two neural networks trained in an adversarial manner to reach a zero-sum Nash equilibrium profile.  ...  This paper reviews the literature on the game-theoretic aspects of GANs and addresses how game theory models can address specific challenges of generative models and improve the GAN's performance.  ...  Constant-sum games are two-player games in which the sum of the two players' utilities equals the same constant in all states. When this constant equals zero, the game is called a zero-sum game [24].  ... 
arXiv:2106.06976v3 fatcat:mecyjeopxnesjfj7bcoiim3p3a
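
The last snippet above defines constant-sum and zero-sum games. As a small sanity check, with toy matrices invented here for illustration, one can test whether a bimatrix game is constant-sum by checking that every cell's payoffs add up to the same constant:

```python
# Returns the common constant c if A[i][j] + B[i][j] == c in every cell
# (zero-sum when c == 0), else None. Toy matrices for illustration.
def constant_sum(A, B):
    sums = {A[i][j] + B[i][j] for i in range(len(A)) for j in range(len(A[0]))}
    return sums.pop() if len(sums) == 1 else None

A = [[3, 1], [0, 2]]
B = [[-3, -1], [0, -2]]
print(constant_sum(A, B))  # -> 0 (a zero-sum game)
print(constant_sum(A, A))  # -> None (not constant-sum)
```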

Efficient Stackelberg Strategies for Finitely Repeated Games [article]

Eshwar Ram Arunachaleswaran, Natalie Collina, Michael Kearns
2022 arXiv   pre-print
More precisely, we give efficient algorithms for finding approximate Stackelberg equilibria in finite-horizon repeated two-player games, along with rates of convergence depending on the horizon T.  ...  We study the problem of efficiently computing optimal strategies in asymmetric leader-follower games repeated a finite number of times, which presents a different set of technical challenges than the infinite-horizon  ...  However, for general-sum games, it is not sufficient to construct Stackelberg GPAs from learning algorithms.  ... 
arXiv:2207.04192v2 fatcat:gc6vb2l7kjdenp73t6da4s2ifq

Green Power Control in Cognitive Wireless Networks

Mael Le Treust, Samson Lasaulce, Yezekael Hayel, Gaoning He
2013 IEEE Transactions on Vehicular Technology  
This game is shown to be a weighted potential game and its set of equilibria is studied.  ...  The Stackelberg equilibrium analysis of this 2-level hierarchical game is conducted, which allows us to better understand the effects of cognition on energy-efficiency.  ...  Equilibria are indeed less energy-efficient (say, in terms of sum utility) than the centralized solution.  ... 
doi:10.1109/tvt.2012.2227858 fatcat:72shm7uicvdo3d6tx2x4ctlfty

Learning to Give Checkable Answers with Prover-Verifier Games [article]

Cem Anil, Guodong Zhang, Yuhuai Wu, Roger Grosse
2021 arXiv   pre-print
We analyze variants of the framework, including simultaneous and sequential games, and narrow the space down to a subset of games which provably have the desired equilibria.  ...  We introduce Prover-Verifier Games (PVGs), a game-theoretic framework to encourage learning agents to solve decision problems in a verifiable manner.  ...  Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute.  ... 
arXiv:2108.12099v1 fatcat:gi4ym3kodneqpcz2zcn5jqfxhe