Entropy-Regularized Stochastic Games [article] (2019, arXiv pre-print)
We consider both entropy-regularized N-stage and entropy-regularized discounted stochastic games, and establish the existence of a value in both games. ...
In this paper, we introduce entropy-regularized stochastic games where each player aims to maximize the causal entropy of its strategy in addition to its expected payoff. ...
Entropy-Regularized N-Stage Games: Searching for optimal strategies that solve a stochastic game with the evaluation function Φ_N(σ, τ) in the space of all strategies can be intractable for large N. ...
arXiv:1907.11543v2
fatcat:ofgpr54vafetvm3vsnisd3giem
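The entry above adds the causal entropy of each player's strategy to its expected payoff. For a single controller, this kind of entropy bonus turns the Bellman max into a scaled log-sum-exp and the optimal policy into a softmax; a minimal sketch (shapes, temperature, and iteration count are illustrative, not the paper's N-stage operator Φ_N):

```python
import numpy as np

def soft_value_iteration(P, R, gamma=0.9, tau=0.1, iters=200):
    """Entropy-regularized (soft) value iteration.

    P: transition tensor, shape (S, A, S); R: rewards, shape (S, A).
    The entropy bonus turns the Bellman max into a scaled
    log-sum-exp, and the greedy policy into a softmax.
    """
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * P @ V                           # (S, A)
        V = tau * np.logaddexp.reduce(Q / tau, axis=1)  # soft max over actions
    pi = np.exp((Q - V[:, None]) / tau)                 # softmax policy, rows sum to 1
    return V, pi
```

As tau → 0 this recovers ordinary value iteration with a deterministic greedy policy.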
Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation [article] (2021, arXiv pre-print)
We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum stochastic games. ...
We propose a new Q-learning type algorithm that uses a sequence of entropy-regularized soft policies to approximate the Nash policy during the Q-function updates. ...
Entropy-Regularized Policy Approximation Entropy-regularized policy approximation [15] was originally introduced to reduce the maximization bias commonly seen in learning algorithms. ...
arXiv:2009.00162v2
fatcat:wqjvzsq7wneg7lhyzkbrgkmsfy
Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation (2021, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, unpublished)
We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum stochastic games. ...
We prove that under certain conditions, by updating the entropy regularization, the algorithm converges to a Nash equilibrium. ...
Background Two-agent Zero-sum Stochastic Games. ...
doi:10.24963/ijcai.2021/339
fatcat:6m7sewgpr5gfljdcqypd2egjz4
Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent [article] (2022, arXiv pre-print)
It accomplishes this by tracing a previously established homotopy that defines a continuum of equilibria for the game regularized with decaying levels of entropy. ...
selection problem of many-player games. ...
entropy regularization too high and risk solving for the Nash of a game we are not interested in. ...
arXiv:2106.01285v3
fatcat:4rw67ctfxzba3iedzab5rwzp7e
q-Munchausen Reinforcement Learning [article] (2022, arXiv pre-print)
The proposed formulation leads to implicit Tsallis KL regularization under the maximum Tsallis entropy framework. ...
The recently successful Munchausen Reinforcement Learning (M-RL) features implicit Kullback-Leibler (KL) regularization by augmenting the reward function with the logarithm of the current stochastic policy ...
RELATED WORK Entropy Regularization. ...
arXiv:2205.07467v1
fatcat:bdfwy3fqtffkxhqtepkb2ny3lq
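The snippet above describes M-RL's reward augmentation: the scaled log-probability of the taken action is added to the reward. A hedged sketch of the standard (Shannon/KL) Munchausen target for one transition — the q-deformed variant this paper proposes replaces the logarithm with the Tsallis q-logarithm; all hyperparameter values here are hypothetical:

```python
import numpy as np

def munchausen_target(q_next, q_curr, a, r, gamma=0.99, tau=0.03, alpha=0.9, l0=-1.0):
    """Munchausen target for one transition (s, a, r, s').

    q_curr: Q(s, .), q_next: Q(s', .) as 1-D arrays over actions.
    The reward is augmented with alpha * tau * log pi(a|s), clipped
    below at l0 for numerical stability; the bootstrap term is the
    entropy-regularized (soft) expected value at s'.
    """
    def log_softmax(q):
        z = q / tau
        return z - np.logaddexp.reduce(z)

    log_pi_curr = log_softmax(q_curr)
    log_pi_next = log_softmax(q_next)
    bonus = np.clip(tau * log_pi_curr[a], l0, 0.0)  # Munchausen log-policy term
    soft_v_next = np.sum(np.exp(log_pi_next) * (q_next - tau * log_pi_next))
    return r + alpha * bonus + gamma * soft_v_next
```

Setting alpha = 0 recovers the plain soft (maximum-entropy) Bellman target.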
Page 5275 of Mathematical Reviews, Issue 85k [page] (1985, Mathematical Reviews)
Game Theory 9 (1980), no. 1, 25-36; MR 81i:90207], and (ii) games of perfect information. ...
Inouye, Yujiro (J-OSAKE-G) 85k:93066 Maximum entropy spectral estimation for regular time series of degenerate rank.
IEEE Trans. Acoust. Speech Signal Process. 32 (1984), no. 4, 733-740. ...
Page 8528 of Mathematical Reviews, Issue 2001K [page] (2001, Mathematical Reviews)
After defining entropy for nonlinear systems, the problem of minimizing entropy is approached. ...
The author considers the problem of optimal feedback scalar control associated with a stochastic differential system. ...
Penalty-Regulated Dynamics and Robust Learning Procedures in Games (2015, Mathematics of Operations Research)
games. ...
payoffs: in fact, the algorithm retains its robustness in the presence of stochastic perturbations and observation errors, and does not require any synchronization between players. ...
game dynamics (ED), we have: let (N, A, u) be a finite game and let h : X → ℝ be a regular entropy function with choice map Q : ∏_k ℝ^{A_k} → X. ...
doi:10.1287/moor.2014.0687
fatcat:z5asnj7uzjcnreruv2vdjmwzha
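When the regular entropy function h in the snippet above is the negative Gibbs–Shannon entropy, the induced choice map Q is the familiar logit (softmax) map, Q(y) = argmax_x { ⟨y, x⟩ − h(x) }. A minimal sketch (the temperature parameter is an added assumption):

```python
import numpy as np

def logit_choice(y, temp=1.0):
    """Logit choice map induced by the Gibbs-Shannon entropy:
    maps a payoff vector y to the strategy maximizing
    <y, x> + temp * H(x), i.e. a temperature softmax."""
    z = np.asarray(y, dtype=float) / temp
    z -= z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()
```

Other regular entropy functions (e.g. Tsallis) yield different, possibly sparse, choice maps.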
Divergence-Regularized Multi-Agent Actor-Critic [article] (2021, arXiv pre-print)
Entropy regularization is a popular method in reinforcement learning (RL). ...
We evaluate DMAC in a didactic stochastic game and StarCraft Multi-Agent Challenge and empirically show that DMAC substantially improves the performance of existing MARL algorithms. ...
We also show the benefit of exploration in this stochastic game for the convenience of statistics. ...
arXiv:2110.00304v1
fatcat:4p4cz3lf3vc3phfcsi7sjqojza
Generative Actor-Critic: An Off-policy Algorithm Using the Push-forward Model [article] (2021, arXiv pre-print)
technique, MMD-entropy regularizer, to balance the exploration and exploitation. ...
Model-free deep reinforcement learning has achieved great success in many domains, such as video games, recommendation systems and robotic control tasks. ...
To show the influence of the MMD-entropy regularizer on the stochasticity of policies, we compare the action distributions of the policies learned from GAC without the regularizer and GAC with adaptive ...
arXiv:2105.03733v2
fatcat:vjmbl4zuhfdnvme64iwvgjsgly
Page 1605 of Mathematical Reviews, Issue 2004b [page] (2004, Mathematical Reviews)
Let U(h) denote the max min of the one-shot game in which player one is restricted to mixed actions of entropy at most h. ...
(English summary)
Stochastic equilibrium problems in economics and game theory. Ann. Oper. Res. 114 (2002), 33-38. ...
Entropy Minimization In Emergent Languages [article] (2020, arXiv pre-print)
Further, we observe that stronger discrete-channel-driven entropy minimization leads to representations with increased robustness to overfitting and adversarial attacks. ...
We find that, under common training procedures, the emergent languages are subject to an entropy minimization pressure that has also been detected in human language, whereby the mutual information between ...
In Figures 1c & 1d , we report H(m) when training with Stochastic Graph Optimization and REINFORCE across degrees of entropy regularization. ...
arXiv:1905.13687v3
fatcat:dugglhpaofgdrmagoxsfhevwpm
Entropy Regularization for Mean Field Games with Learning [article] (2021, arXiv pre-print)
This paper analyzes both quantitatively and qualitatively the impact of entropy regularization for Mean Field Game (MFG) with learning in a finite time horizon. ...
Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilize and accelerate convergence to the game equilibrium. ...
Game payoff with entropy regularization. ...
arXiv:2010.00145v2
fatcat:yjzvae3bn5bbvmty4ozmmot7iq
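The "game payoff with entropy regularization" mentioned above is, per player, the expected payoff plus a weighted Shannon entropy of the policy. A one-shot sketch (function name and the weight lam are illustrative):

```python
import numpy as np

def regularized_payoff(pi, r, lam=0.1):
    """Expected payoff plus a Shannon-entropy bonus: the objective
    each player maximizes in an entropy-regularized game.
    pi: action distribution; r: per-action payoffs; lam: weight."""
    pi = np.asarray(pi, dtype=float)
    entropy = -np.sum(pi * np.log(pi, where=pi > 0, out=np.zeros_like(pi)))
    return float(pi @ r + lam * entropy)
```

The maximizer is the softmax policy pi* ∝ exp(r / lam), which is why the entropy term yields smooth, stochastic (and in the MFG setting, time-dependent) policies.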
Effective Exploration for Deep Reinforcement Learning via Bootstrapped Q-Ensembles under Tsallis Entropy Regularization [article] (2018, arXiv pre-print)
Specifically, a general form of Tsallis entropy regularizer will be utilized to drive entropy-induced exploration based on efficient approximation of optimal action-selection policies. ...
Meanwhile, by employing an ensemble of Q-networks under varied Tsallis entropy regularization, the diversity of the ensemble can be further enhanced to enable effective bootstrap-induced exploration. ...
human game players. ...
arXiv:1809.00403v2
fatcat:kgl7sktmpbgghfns4g6ewfrtdy
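The Tsallis entropy regularizer mentioned above generalizes the Shannon entropy with an entropic index q; a minimal sketch of the standard definition (parameter defaults are illustrative):

```python
import numpy as np

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    Recovers the Shannon entropy in the limit q -> 1; q = 2 gives
    the regularizer that induces sparse optimal policies."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p, where=p > 0, out=np.zeros_like(p)))
    return (1.0 - np.sum(p ** q)) / (q - 1.0)
```

Varying q across an ensemble of Q-networks, as the entry describes, varies how strongly each member's exploration is spread over actions.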
Page 1202 of Mathematical Reviews, Issue 92b [page] (1992, Mathematical Reviews)
(PL-WROCT) Semicontinuous nonstationary stochastic games. II. J. Math. Anal. Appl. 148 (1990), no. 1, 22-43. ...
Summary: “We introduce a Borel space framework for zero-sum discrete-time stochastic games, that is, a game-theoretic extension of some nonstationary dynamic programming models in the sense of Hinderer ...
Showing results 1 — 15 out of 8,978 results