Entropy-Regularized Stochastic Games [article]

Yagiz Savas, Mohamadreza Ahmadi, Takashi Tanaka, Ufuk Topcu
2019 arXiv   pre-print
We consider both entropy-regularized N-stage and entropy-regularized discounted stochastic games, and establish the existence of a value in both games.  ...  In this paper, we introduce entropy-regularized stochastic games where each player aims to maximize the causal entropy of its strategy in addition to its expected payoff.  ...  Entropy-Regularized N-Stage Games Searching optimal strategies that solve a stochastic game with the evaluation function Φ_N(σ, τ) in the space of all strategies can be intractable for large N.  ... 
arXiv:1907.11543v2 fatcat:ofgpr54vafetvm3vsnisd3giem

Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation [article]

Yue Guan, Qifan Zhang, Panagiotis Tsiotras
2021 arXiv   pre-print
We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum stochastic games.  ...  We propose a new Q-learning type algorithm that uses a sequence of entropy-regularized soft policies to approximate the Nash policy during the Q-function updates.  ...  Entropy-Regularized Policy Approximation Entropy-regularized policy approximation [15] was originally introduced to reduce the maximization bias commonly seen in learning algorithms.  ... 
arXiv:2009.00162v2 fatcat:wqjvzsq7wneg7lhyzkbrgkmsfy

Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation

Yue Guan, Qifan Zhang, Panagiotis Tsiotras
2021 Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence   unpublished
We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum stochastic games.  ...  We prove that under certain conditions, by updating the entropy regularization, the algorithm converges to a Nash equilibrium.  ...  Background Two-agent Zero-sum Stochastic Games.  ... 
doi:10.24963/ijcai.2021/339 fatcat:6m7sewgpr5gfljdcqypd2egjz4

Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent [article]

Ian Gemp, Rahul Savani, Marc Lanctot, Yoram Bachrach, Thomas Anthony, Richard Everett, Andrea Tacchetti, Tom Eccles, János Kramár
2022 arXiv   pre-print
It accomplishes this by tracing a previously established homotopy that defines a continuum of equilibria for the game regularized with decaying levels of entropy.  ...  selection problem of many-player games.  ...  entropy regularization too high and risk solving for the Nash of a game we are not interested in.  ... 
arXiv:2106.01285v3 fatcat:4rw67ctfxzba3iedzab5rwzp7e

q-Munchausen Reinforcement Learning [article]

Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara
2022 arXiv   pre-print
The proposed formulation leads to implicit Tsallis KL regularization under the maximum Tsallis entropy framework.  ...  The recently successful Munchausen Reinforcement Learning (M-RL) features implicit Kullback-Leibler (KL) regularization by augmenting the reward function with logarithm of the current stochastic policy  ...  RELATED WORK Entropy Regularization.  ... 
arXiv:2205.07467v1 fatcat:bdfwy3fqtffkxhqtepkb2ny3lq

Page 5275 of Mathematical Reviews Vol. , Issue 85k [page]

1985 Mathematical Reviews  
Game Theory 9 (1980), no. 1, 25-36; MR 81i:90207], and (ii) games of perfect information.  ...  Inouye, Yujiro (J-OSAKE-G) 85k:93066 Maximum entropy spectral estimation for regular time series of degenerate rank. IEEE Trans. Acoust. Speech Signal Process. 32 (1984), no. 4, 733-740.  ... 

Page 8528 of Mathematical Reviews Vol. , Issue 2001K [page]

2001 Mathematical Reviews  
After defining entropy for nonlinear systems, the problem of minimizing entropy is approached.  ...  The author considers the problem of optimal feedback scalar control associated with a stochastic differential system.  ... 

Penalty-Regulated Dynamics and Robust Learning Procedures in Games

Pierre Coucheney, Bruno Gaujal, Panayotis Mertikopoulos
2015 Mathematics of Operations Research  
games.  ...  payoffs: in fact, the algorithm retains its robustness in the presence of stochastic perturbations and observation errors, and does not require any synchronization between players.  ...  game dynamics (ED), we have: (N, A, u) be a finite game and let h : X → R be a regular entropy function with choice map Q : k R A k → X.  ... 
doi:10.1287/moor.2014.0687 fatcat:z5asnj7uzjcnreruv2vdjmwzha

Divergence-Regularized Multi-Agent Actor-Critic [article]

Kefan Su, Zongqing Lu
2021 arXiv   pre-print
Entropy regularization is a popular method in reinforcement learning (RL).  ...  We evaluate DMAC in a didactic stochastic game and StarCraft Multi-Agent Challenge and empirically show that DMAC substantially improves the performance of existing MARL algorithms.  ...  We also show the benefit of exploration in this stochastic game for the convenience of statistics.  ... 
arXiv:2110.00304v1 fatcat:4p4cz3lf3vc3phfcsi7sjqojza

Generative Actor-Critic: An Off-policy Algorithm Using the Push-forward Model [article]

Lingwei Peng, Hui Qian, Zhebang Shen, Chao Zhang, Fei Li
2021 arXiv   pre-print
technique, MMD-entropy regularizer, to balance the exploration and exploitation.  ...  Model-free deep reinforcement learning has achieved great success in many domains, such as video games, recommendation systems and robotic control tasks.  ...  To show the influence of the MMD-entropy regularizer on the stochasticity of policies, we compare the action distributions of the policies learned from GAC without the regularizer and GAC with adaptive  ... 
arXiv:2105.03733v2 fatcat:vjmbl4zuhfdnvme64iwvgjsgly

Page 1605 of Mathematical Reviews Vol. , Issue 2004b [page]

2004 Mathematical Reviews  
Let U(h) denote the max min of the one-shot game in which player one is restricted to mixed actions of entropy at most h.  ...  (English summary) Stochastic equilibrium problems in economics and game theory. Ann. Oper. Res. 114 (2002), 33-38.  ... 

Entropy Minimization In Emergent Languages [article]

Eugene Kharitonov and Rahma Chaabouni and Diane Bouchacourt and Marco Baroni
2020 arXiv   pre-print
Further, we observe that stronger discrete-channel-driven entropy minimization leads to representations with increased robustness to overfitting and adversarial attacks.  ...  We find that, under common training procedures, the emergent languages are subject to an entropy minimization pressure that has also been detected in human language, whereby the mutual information between  ...  In Figures 1c & 1d, we report H(m) when training with Stochastic Graph Optimization and REINFORCE across degrees of entropy regularization.  ... 
arXiv:1905.13687v3 fatcat:dugglhpaofgdrmagoxsfhevwpm

Entropy Regularization for Mean Field Games with Learning [article]

Xin Guo, Renyuan Xu, Thaleia Zariphopoulou
2021 arXiv   pre-print
This paper analyzes both quantitatively and qualitatively the impact of entropy regularization for Mean Field Game (MFG) with learning in a finite time horizon.  ...  Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilize and accelerate convergence to the game equilibrium.  ...  Game payoff with entropy regularization.  ... 
arXiv:2010.00145v2 fatcat:yjzvae3bn5bbvmty4ozmmot7iq

Effective Exploration for Deep Reinforcement Learning via Bootstrapped Q-Ensembles under Tsallis Entropy Regularization [article]

Gang Chen and Yiming Peng and Mengjie Zhang
2018 arXiv   pre-print
Specifically, a general form of Tsallis entropy regularizer will be utilized to drive entropy-induced exploration based on efficient approximation of optimal action-selection policies.  ...  Meanwhile, by employing an ensemble of Q-networks under varied Tsallis entropy regularization, the diversity of the ensemble can be further enhanced to enable effective bootstrap-induced exploration.  ...  human game players.  ... 
arXiv:1809.00403v2 fatcat:kgl7sktmpbgghfns4g6ewfrtdy

Page 1202 of Mathematical Reviews Vol. , Issue 92b [page]

1992 Mathematical Reviews  
(PL-WROCT) Semicontinuous nonstationary stochastic games. II. J. Math. Anal. Appl. 148 (1990), no. 1, 22-43.  ...  Summary: “We introduce a Borel space framework for zero-sum discrete-time stochastic games, that is, a game-theoretic extension of some nonstationary dynamic programming models in the sense of Hinderer  ... 