Fuzzy Theory Based Single Belief State Generation for Partially Observable Real-time Strategy Games
2019
IEEE Access
Therefore, this paper proposes a fuzzy theory-based single belief state generation method, named FTH, based on multi-layer information sets extracted from historical position information. ...
As a basic problem of real-time strategy (RTS) games, AI planning has attracted wide attention from researchers, but it remains a huge challenge due to its large search space and real-time ...
ACKNOWLEDGMENT We want to thank Quanjun Yin and Qi Zhang for discussions about partially observable environments and optimization algorithm design, and Yanqing Ye for editing the draft. ...
doi:10.1109/access.2019.2923419
fatcat:gknjfm2lwbfivlpoldg3fgdaiy
Multiple Tree for Partially Observable Monte-Carlo Tree Search
[article]
2011
arXiv
pre-print
We propose an algorithm for computing approximate Nash equilibria of partially observable games using Monte-Carlo tree search based on recent bandit methods. ...
We obtain experimental results for the game of phantom tic-tac-toe, showing that strong strategies can be efficiently computed by our algorithm. ...
... on the real state of the game. ...
arXiv:1102.1580v1
fatcat:j37pquo465dwhcgcnotuvi7ugm
Multiple Tree for Partially Observable Monte-Carlo Tree Search
[chapter]
2011
Lecture Notes in Computer Science
We propose an algorithm for computing approximate Nash equilibria of partially observable games using Monte-Carlo tree search based on recent bandit methods. ...
We obtain experimental results for the game of phantom tic-tac-toe, showing that strong strategies can be efficiently computed by our algorithm. ...
... on the real state of the game. ...
doi:10.1007/978-3-642-20525-5_6
fatcat:g2pqzhnnwbd5pny42wwoxb3sta
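The "recent bandit methods" both records above build on are, in the classic case, UCB-style indices used to pick actions during the tree descent of MCTS. A minimal sketch of UCB1 selection in Python, with a toy Node type of our own; the paper's multiple-tree bookkeeping (one tree per player's observation history) is not reproduced here.

```python
import math
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    # Minimal MCTS node: visit count, accumulated value, children keyed by action.
    visits: int = 0
    total_value: float = 0.0
    children: dict = field(default_factory=dict)  # action -> Node

def ucb1_select(node: Node, c: float = math.sqrt(2)):
    """Return the action maximizing the UCB1 index Q + c * sqrt(ln N / n)."""
    # Try every action once before trusting the index.
    unvisited = [a for a, ch in node.children.items() if ch.visits == 0]
    if unvisited:
        return random.choice(unvisited)
    total = sum(ch.visits for ch in node.children.values())
    return max(
        node.children,
        key=lambda a: node.children[a].total_value / node.children[a].visits
        + c * math.sqrt(math.log(total) / node.children[a].visits),
    )
```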
On Improving Deep Reinforcement Learning for POMDPs
[article]
2018
arXiv
pre-print
The time series of action-observation pairs is then integrated by an LSTM layer that learns latent states, based on which a fully connected layer computes Q-values as in conventional Deep Q-Networks (DQNs). ...
We demonstrate the effectiveness of our new architecture in several partially observable domains, including flickering Atari games. ...
We have demonstrated the effectiveness of our proposed approach in several POMDP problems in comparison to the state-of-the-art approaches. ...
arXiv:1704.07978v6
fatcat:hjar7q5p6bd6vfu7pf75h5p5gu
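A minimal PyTorch sketch of the architecture this abstract describes: observation features are concatenated with a one-hot encoding of the previous action at each timestep, an LSTM compresses the history into a latent state, and a fully connected head maps that state to Q-values as in a conventional DQN. The class name and sizes are placeholders of ours, not the paper's code.

```python
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        # Each input step is an (observation, previous-action) pair.
        self.lstm = nn.LSTM(obs_dim + n_actions, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, prev_act_seq, state=None):
        # obs_seq: (batch, time, obs_dim); prev_act_seq: (batch, time, n_actions)
        x = torch.cat([obs_seq, prev_act_seq], dim=-1)
        latent, state = self.lstm(x, state)  # latent summary of the history
        return self.q_head(latent), state    # Q-values at every timestep
```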
Solving Partially Observable Stochastic Games with Public Observations
2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence
Partially observable stochastic games (POSGs) are among the most general formal models that capture such dynamic scenarios. ...
We propose such a subclass for two-player zero-sum games with discounted-sum objective function—POSGs with public observations (POPOSGs)—where each player is able to reconstruct beliefs of the other player ...
Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon. ...
doi:10.1609/aaai.v33i01.33012029
fatcat:v5divnvzyvaoxf5mgp2d5xx5gy
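For reference, the discounted-sum objective named in the snippet is the standard one; in our notation (not necessarily the paper's), player 1 maximizes and player 2 minimizes

```latex
V(\sigma_1, \sigma_2) = \mathbb{E}_{(\sigma_1, \sigma_2)}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_t\right],
\qquad \gamma \in [0, 1).
```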
Decision Making in Complex Multiagent Contexts: A Tale of Two Frameworks
2012
The AI Magazine
I put the two frameworks, the decentralized partially observable Markov decision process (Dec-POMDP) and the interactive partially observable Markov decision process (I-POMDP), in context and review the foundational algorithms for these frameworks, while briefly discussing the advances in their specializations. ...
This is analogous to iterated elimination of very weakly dominated behavioral strategies (a well-known technique for compacting games) in the context of partially observable stochastic games. ...
doi:10.1609/aimag.v33i4.2402
fatcat:peqlr3rr5bghffao6zjowl7amy
On Improving Deep Reinforcement Learning for POMDPs
[article]
2018
arXiv
pre-print
The time series of action-observation pairs is then integrated by an LSTM layer that learns latent states, based on which a fully connected layer computes Q-values as in conventional Deep Q-Networks (DQNs). ...
We demonstrate the effectiveness of our new architecture in several partially observable domains, including flickering Atari games. ...
We have demonstrated the effectiveness of our proposed approach in several POMDP problems in comparison to the state-of-the-art approaches. ...
arXiv:1804.06309v2
fatcat:edqab5pgmvgwfjkd3qwevs5drm
Robust Opponent Modeling via Adversarial Ensemble Reinforcement Learning in Asymmetric Imperfect-Information Games
[article]
2020
arXiv
pre-print
This paper presents an algorithmic framework for learning robust policies in asymmetric imperfect-information games, where the joint reward could depend on the uncertain opponent type (a private information ...
We use multiagent reinforcement learning (MARL) to learn opponent models through self-play, which captures the full strategy interaction and reasoning between agents. ...
We summarize the key findings of this work as follows: • We propose algorithms based on MARL and ensemble training for robust opponent modeling and posterior inference over the opponent type from the observed ...
arXiv:1909.08735v4
fatcat:ne2qkof3lvf7rdjww4xhtdsjhi
Deceptive Kernel Function on Observations of Discrete POMDP
[article]
2020
arXiv
pre-print
Based on three characteristic algorithms used by the agent (value iteration, value function approximation, and POMCP), we analyze how its belief is misled by falsified observations produced as the kernel's outputs, and ...
This paper studies deception applied to an agent in a partially observable Markov decision process. ...
Within the field of cybersecurity, deception as a general strategy has been discussed frequently in game-theoretic frameworks from the defender's perspective. ...
arXiv:2008.05585v1
fatcat:uyl3ryghuva5dfox3gfp2j7r6a
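The mechanism the abstract analyzes can be made concrete: a discrete POMDP agent tracks a belief with a Bayes filter, and a deceptive kernel that resamples observations perturbs exactly that update. A minimal sketch under our own array conventions; the paper's kernel construction and its analysis of value iteration, value function approximation, and POMCP are not reproduced.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Standard discrete POMDP belief update.

    b: belief over states, shape (S,); T[a][s, s'] = P(s' | s, a);
    O[a][s', o] = P(o | s', a). Layout is our own convention.
    """
    predicted = b @ T[a]             # predict: P(s' | b, a)
    unnorm = predicted * O[a][:, o]  # correct: weight by P(o | s', a)
    return unnorm / unnorm.sum()

def deceptive_kernel(o_true, K, rng=np.random.default_rng()):
    """Sample a falsified observation from K[o_true, :] = P(o_fake | o_true).

    The agent then runs belief_update on the falsified observation,
    which is how the kernel misleads its belief. K is a hypothetical
    stand-in; each row must sum to 1.
    """
    return rng.choice(K.shape[1], p=K[o_true])
```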
A Model-Based, Decision-Theoretic Perspective on Automated Cyber Response
[article]
2020
arXiv
pre-print
We combine a simulation of the system to be defended with an anytime online planner to solve cyber defense problems characterized as partially observable Markov decision problems (POMDPs). ...
Cyber-attacks can occur at machine speeds that are far too fast for human-in-the-loop (or sometimes on-the-loop) decision making to be a viable option. ...
One way to account for these issues is to address the cyber response problem directly as a partially observable stochastic game (e.g. as a partially observable competitive Markov decision process (Zonouz ...
arXiv:2002.08957v1
fatcat:rakhpiufdve4rorxi5o45rhlqy
Resilience of LTE eNode B against smart jammer in infinite-horizon asymmetric repeated zero-sum game
2020
Physical Communication
The smart jammer (the informed player) uses its evolving belief state as a fixed-size sufficient statistic for the repeated game. ...
Hence, the problem is convexified by devising suboptimal security strategies, with guaranteed performance for both players, based on an approximated optimal game value. ...
In a more general setting, the informed player may decide to reveal no information, partial information, or complete information, based on its payoff model, to exploit the situation for its own benefit. ...
doi:10.1016/j.phycom.2019.100989
fatcat:7morvf5dobbfzmxfv4e6vrgava
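The "evolving belief state" used as a fixed-size sufficient statistic follows the standard Bayesian update for repeated games with one-sided incomplete information; in our notation, with \sigma_k(a_t) the probability that informed type k plays action a_t:

```latex
b_{t+1}(k) = \frac{b_t(k)\,\sigma_k(a_t)}{\sum_{j} b_t(j)\,\sigma_j(a_t)}
```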
Improving Policies via Search in Cooperative Partially Observable Games
[article]
2019
arXiv
pre-print
However, just like humans, real-world AI systems have to coordinate and communicate with other agents in cooperative partially observable environments as well. ...
In this paper we propose two different search techniques that can be applied to improve an arbitrary agreed-upon policy in a cooperative partially observable game. ...
Acknowledgments We would like to thank Pratik Ringshia for developing user interfaces used to interact with Hanabi agents. ...
arXiv:1912.02318v1
fatcat:3g56lo36bbaofa2zoggd3xan2e
A Reinforcement Learning Scheme for a Partially-Observable Multi-Agent Game
2005
Machine Learning
The problem can approximately be dealt with in the framework of a partially observable Markov decision process (POMDP) for a single-agent system. ...
We formulate an automatic strategy acquisition problem for the multi-agent card game "Hearts" as a reinforcement learning problem. ...
This study was partly supported by a Grant-in-Aid for Scientific Research (B) (No. 16014214) from the Japan Society for the Promotion of Science. ...
doi:10.1007/s10994-005-0461-8
fatcat:rp7oo5rj3nf55exvfqjnt5nb5u
Improving Policies via Search in Cooperative Partially Observable Games
2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence
However, just like humans, real-world AI systems have to coordinate and communicate with other agents in cooperative partially observable environments as well. ...
In this paper we propose two different search techniques that can be applied to improve an arbitrary agreed-upon policy in a cooperative partially observable game. ...
We use $\tau_t = \{s_0, a_0, r_0, \ldots, s_t\}$ to denote the game history (or 'trajectory') at time $t$. ...
doi:10.1609/aaai.v34i05.6208
fatcat:2uyfzliewzgzpccpazwcmtzywi
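A deliberately simplified caricature of search on top of an agreed-upon policy, as both records above describe: estimate each candidate action by rolling out the blueprint policy and keep the best. The paper's actual techniques additionally track beliefs over hidden information and keep the searcher consistent with what partners expect; `blueprint_rollout` is a hypothetical simulator of ours.

```python
def search_improved_action(state, legal_actions, blueprint_rollout, n_rollouts=100):
    """One-step Monte-Carlo improvement over a fixed blueprint policy.

    blueprint_rollout(state, action) plays `action`, then follows the
    blueprint to the end of the game and returns one sampled return.
    """
    def mean_return(action):
        total = sum(blueprint_rollout(state, action) for _ in range(n_rollouts))
        return total / n_rollouts
    return max(legal_actions, key=mean_return)
```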
Approximating n-player behavioural strategy Nash equilibria using coevolution
2011
Proceedings of the 13th annual conference on Genetic and evolutionary computation - GECCO '11
In this paper we propose a coevolutionary algorithm that approximates behavioural strategy Nash equilibria in n-player zero-sum games by exploiting the minimax solution concept. ...
In order to support our case we provide a set of experiments in both games of known and unknown equilibria. ...
Partially Observable Markov Decision Process. A Partially Observable MDP (POMDP) [15] is described by a tuple $P = \langle S, A, T, f, O, N, b_0 \rangle$. ...
doi:10.1145/2001576.2001726
dblp:conf/gecco/SamothrakisL11
fatcat:hcw7pblre5esjaaayzrjcwv7ta
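The minimax solution concept the coevolutionary algorithm exploits is, in our notation, each player's security level: the best payoff player i can guarantee against a worst-case opponent, which in two-player zero-sum games coincides with the Nash value.

```latex
v_i = \max_{\sigma_i} \min_{\sigma_{-i}} u_i(\sigma_i, \sigma_{-i})
```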
Showing results 1–15 of 50,886.