Real or fake? Natural and artificial social stimuli elicit divergent behavioural and neural responses in mangrove rivulus, Kryptolebias marmoratus
2018
Proceedings of the Royal Society B: Biological Sciences
model opponent, regular mirror, non-reversing mirror and live opponent. ...
of the basolateral amygdala and hippocampus, but lower IEG expression in the preoptic area, than fighting with a non-reversing mirror image or live opponent; (iii) stationary models elicited the least ...
(LO, live opponent; RMS, regular mirror-image stimulation; NMS, non-reversing mirror-image stimulation; SMO, stationary model opponent). ...
doi:10.1098/rspb.2018.1610
pmid:30429304
pmcid:PMC6253381
fatcat:pm672pybzjcc3moc7yalbzv3w4
Efficiently detecting switches against non-stationary opponents
2016
Autonomous Agents and Multi-Agent Systems
Moreover, some works also assume the opponent will use a stationary strategy. ...
This will turn the problem into learning in a non-stationary environment, posing a problem for most learning algorithms. ...
circles for the non-stationary opponent (NSopponent). ...
doi:10.1007/s10458-016-9352-6
fatcat:deyd52hqbjg3xbzqsjl5stgedu
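A hedged illustration of the switch-detection theme in this entry: a Page-Hinkley change-point test run on the observed reward stream. This is a generic drift detector, not the algorithm from the paper, and the delta and threshold values are assumptions.

```python
# Minimal Page-Hinkley change detector on a reward stream.
# Generic illustration of switch detection; not the paper's algorithm.
import random

class PageHinkley:
    def __init__(self, delta=0.05, threshold=5.0):
        self.delta = delta          # tolerated drift magnitude (assumed)
        self.threshold = threshold  # alarm threshold (assumed)
        self.mean = 0.0             # running mean of observations
        self.cum = 0.0              # cumulative deviation statistic
        self.min_cum = 0.0          # minimum of the statistic so far
        self.n = 0

    def update(self, x):
        """Feed one observation; return True if a switch is suspected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold

if __name__ == "__main__":
    random.seed(0)
    detector = PageHinkley()
    # Rewards from one opponent strategy, then a different one at t = 100.
    stream = [random.gauss(1.0, 0.3) for _ in range(100)] + \
             [random.gauss(3.0, 0.3) for _ in range(50)]
    for t, r in enumerate(stream):
        if detector.update(r):
            print(f"switch suspected at round {t}")
            break
```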
Learning to Model Opponent Learning (Student Abstract)
2020
Proceedings of the AAAI Conference on Artificial Intelligence
Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. ...
The adaptation and learning of other agents induces non-stationarity in the environment dynamics. ...
Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. ...
doi:10.1609/aaai.v34i10.7157
fatcat:i665llqczre65dcxadzbhn3mya
An exploration strategy for non-stationary opponents
2016
Autonomous Agents and Multi-Agent Systems
Second, we propose a new algorithm called R-max# for learning and planning against non-stationary opponents. ...
R-max# makes efficient use of exploration experiences, which results in rapid adaptation and efficient drift exploration (DE), to deal with the non-stationary nature of the opponent. ...
Another related approach is MDP4.5 [25] which is designed to learn against switching (non-stationary) opponents. MDP4.5 uses decision trees to learn a model of the opponent. ...
doi:10.1007/s10458-016-9347-3
fatcat:ye7swi6bmrh2vpjdf5lspl3xte
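The snippet above describes R-max#'s combination of optimism and drift exploration. The sketch below shows only the general idea, under assumed bandit-style feedback: optimistic value estimates plus a periodic forgetting schedule that forces re-exploration. It is not the R-max# algorithm itself; R_MAX, KNOWN_AFTER and RESET_EVERY are made-up parameters.

```python
# Toy optimistic learner that periodically "forgets" old estimates so it
# re-explores and can track a drifting opponent. A sketch of the general
# idea only; not the R-max# algorithm from the paper above.
import random

R_MAX = 1.0        # optimistic value assumed for under-explored actions
KNOWN_AFTER = 10   # visits before an action's estimate is trusted
RESET_EVERY = 200  # hypothetical drift-exploration schedule

def run(opponent, horizon=600, n_actions=2):
    counts = [0] * n_actions
    sums = [0.0] * n_actions
    total = 0.0
    for t in range(horizon):
        if t > 0 and t % RESET_EVERY == 0:
            counts = [0] * n_actions  # forget: forces fresh exploration
            sums = [0.0] * n_actions
        # an action is optimistically worth R_MAX until visited enough
        est = [sums[a] / counts[a] if counts[a] >= KNOWN_AFTER else R_MAX
               for a in range(n_actions)]
        best = max(est)
        a = random.choice([i for i, v in enumerate(est) if v == best])
        r = opponent(t, a)
        counts[a] += 1
        sums[a] += r
        total += r
    return total

# Opponent that silently switches which action it rewards at t = 300.
payoff = lambda t, a: float(a == (0 if t < 300 else 1))
random.seed(1)
print("average reward:", run(payoff) / 600)
```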
A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
[article]
2019
arXiv pre-print
The key challenge in multiagent learning is learning a best response to the behaviour of other agents, which may be non-stationary: if the other agents adapt their strategy as well, the learning target ...
This survey presents a coherent overview of work that addresses opponent-induced non-stationarity with tools from game theory, reinforcement learning and multi-armed bandits. ...
Updating that model (and therefore the policy) is the way to keep up with a non-stationary opponent. ...
arXiv:1707.09183v2
fatcat:mnducjpn7zawpnw3u6wnhhc6k4
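The survey's point that keeping the opponent model updated is how the policy keeps up can be made concrete with a minimal sketch: an exponentially discounted frequency model of the opponent's actions, re-best-responded to every round. The payoff matrix and decay factor below are illustrative assumptions, not taken from the survey.

```python
# Sketch of the update-the-model-then-best-respond loop the survey describes,
# using a discounted frequency model so recent opponent behaviour dominates.
import random

PAYOFF = [[3, 0],   # our payoff: PAYOFF[our_action][their_action]
          [5, 1]]   # (a prisoner's-dilemma-style matrix, for illustration)
DECAY = 0.9         # assumed forgetting factor for old observations

def best_response(opp_freq):
    total = sum(opp_freq)
    probs = [f / total for f in opp_freq]
    expected = [sum(PAYOFF[a][b] * probs[b] for b in range(2)) for a in range(2)]
    return expected.index(max(expected))

opp_freq = [1.0, 1.0]  # pseudo-counts of opponent actions (cooperate, defect)
random.seed(0)
for t in range(200):
    # opponent is non-stationary: mostly cooperative early, defecting later
    their = 0 if t < 100 and random.random() < 0.9 else 1
    ours = best_response(opp_freq)
    opp_freq = [DECAY * f for f in opp_freq]  # discount the old model...
    opp_freq[their] += 1.0                    # ...and fold in the new evidence
print("final opponent model (pseudo-counts):", opp_freq)
```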
Cooperation without Exploitation between Self-interested Agents
[chapter]
2013
Advances in Intelligent Systems and Computing
We present experimental results obtained against different types of non-stationary opponents. The results show that a small number of games is sufficient to achieve cooperation. ...
The agent learns if the opponent is willing to cooperate by tracking the attitude of its opponent, which tells how much the opponent values its own payoff relative to the agent's payoff. ...
Background on cooperation model: We use the model presented in [6] and extend the work in [7] to non-stationary opponents. ...
doi:10.1007/978-3-642-33932-5_51
fatcat:zskol54cavhjppdwkye57iflci
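A rough sketch of the attitude-tracking idea in this entry, under the assumption (made here for illustration) that the opponent picks the action maximizing own_payoff + attitude * our_payoff: keep the set of attitude values consistent with its observed choices. The payoff matrices and candidate grid are invented.

```python
# Hedged sketch of "attitude" tracking: assume the opponent maximizes
# own_payoff + attitude * our_payoff, and retain the attitude values
# consistent with what we observe. Games and grid are illustrative only.

OWN  = [[3, 0], [5, 1]]   # opponent's payoff: OWN[their_action][our_action]
OURS = [[3, 5], [0, 1]]   # our payoff as the opponent sees it

def consistent_attitudes(observed, our_action, grid):
    """Attitude values under which `observed` is the opponent's best move."""
    keep = []
    for att in grid:
        scores = [OWN[a][our_action] + att * OURS[a][our_action] for a in (0, 1)]
        if scores[observed] >= max(scores):
            keep.append(att)
    return keep

grid = [i / 10 - 1.0 for i in range(21)]   # candidate attitudes in [-1, 1]
history = [(0, 0), (0, 0), (0, 1)]         # (their action, our action) rounds
for their, ours in history:
    grid = consistent_attitudes(their, ours, grid) or grid  # keep last nonempty
print("attitudes consistent with play:", grid)
```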
Learning to Model Opponent Learning
[article]
2020
arXiv pre-print
Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. ...
The adaptation and learning of other agents induces non-stationarity in the environment dynamics. ...
Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. ...
arXiv:2006.03923v1
fatcat:ljhey33ubja75baqfm4f5nyena
Learning to Negotiate Optimally in Non-stationary Environments
[chapter]
2006
Lecture Notes in Computer Science
In so doing, we present a new framework for adaptive negotiation in such non-stationary environments and develop a novel learning algorithm, which is guaranteed to converge, that an agent can use to negotiate ...
Specifically, an agent learns the optimal strategy to play against an opponent whose strategy varies with time, assuming no prior information about its negotiation parameters. ...
Now, to model this process of change in the strategies of the opponent, we use a non-stationary Markov chain. ...
doi:10.1007/11839354_21
fatcat:73meulrjhzforigzaobbijs2je
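To make the quoted modelling choice concrete, here is a minimal non-stationary Markov chain over two opponent strategies, in which the transition matrix itself changes at an assumed switch point; the matrices are illustrative, not the paper's.

```python
# Minimal illustration of a non-stationary Markov chain over opponent
# strategies: the transition matrix itself changes with time. The two
# matrices and the switch point are made up for the example.
import random

P_EARLY = [[0.9, 0.1],   # strategy 0/1 transition probabilities, early rounds
           [0.2, 0.8]]
P_LATE  = [[0.3, 0.7],   # later rounds: the chain drifts toward strategy 1
           [0.1, 0.9]]

def step(state, t):
    P = P_EARLY if t < 50 else P_LATE  # time-dependent transitions
    return 0 if random.random() < P[state][0] else 1

random.seed(0)
state, visits = 0, [0, 0]
for t in range(100):
    state = step(state, t)
    visits[state] += 1
print("strategy occupancy over 100 rounds:", visits)
```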
A general criterion and an algorithmic framework for learning in multi-agent systems
2006
Machine Learning
We then provide several specific instances of the approach: an algorithm for stationary opponents, and two algorithms for adaptive opponents with bounded memory, one algorithm for the n-player case and ...
This new criterion takes in as a parameter the class of opponents. ...
For instance, one might be interested in the classes of opponents that can be modelled by finite automata with at most k states; these include both stationary and non-stationary strategies. ...
doi:10.1007/s10994-006-9643-2
fatcat:mz4k4iwklnfhxegbrv6prssm3u
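As a small illustration of the class of opponents "modelled by finite automata with at most k states" mentioned above, Tit-for-Tat can be written as a 2-state machine; the encoding below is a generic sketch, not the paper's formalism.

```python
# Tit-for-Tat as a 2-state finite automaton: each state stores the action
# the machine plays and where it moves after seeing our action.
TFT = {
    "coop":   {"action": "C", "next": {"C": "coop", "D": "defect"}},
    "defect": {"action": "D", "next": {"C": "coop", "D": "defect"}},
}

def play(automaton, our_moves, start="coop"):
    state, their_moves = start, []
    for ours in our_moves:
        their_moves.append(automaton[state]["action"])
        state = automaton[state]["next"][ours]
    return their_moves

print(play(TFT, ["C", "C", "D", "C", "D"]))  # -> ['C', 'C', 'C', 'D', 'C']
```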
Rapid on-line temporal sequence prediction by an adaptive agent
2005
Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems - AAMAS '05
We consider the case of near-future event prediction by an online learning agent operating in a non-stationary environment. ...
We demonstrate the method using both synthetic data and empirical experience from a gameplaying scenario with human opponents. ...
To measure the adaptability of the ELPH algorithm compared to other methods, we created non-stationary processes by alternating between two mixture models with independent λ values. ...
doi:10.1145/1082473.1082484
dblp:conf/atal/JensenBGS05
fatcat:inib7hsksbeepp2rt2jhpxnb4u
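A sketch of the evaluation setup the snippet describes, alternating between two mixture models on a fixed schedule. Reading each λ as the rate of an exponential component is an assumption made purely for illustration; the paper's actual mixtures and the ELPH predictor are not reproduced here.

```python
# Build a non-stationary stream by alternating two mixture models.
# Component rates, weights, and block length are hypothetical.
import random

def mixture_sample(lams, weights):
    """Draw from a mixture of exponential components with rates `lams`."""
    lam = random.choices(lams, weights=weights)[0]
    return random.expovariate(lam)

random.seed(0)
MODEL_A = ([1.0, 5.0], [0.5, 0.5])   # hypothetical component rates/weights
MODEL_B = ([0.2, 2.0], [0.7, 0.3])
stream, block = [], 50               # alternate models every 50 samples
for t in range(200):
    model = MODEL_A if (t // block) % 2 == 0 else MODEL_B
    stream.append(mixture_sample(*model))
print("mean of first block:", sum(stream[:50]) / 50)
print("mean of second block:", sum(stream[50:100]) / 50)
```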
Towards a Fast Detection of Opponents in Repeated Stochastic Games
[chapter]
2017
Lecture Notes in Computer Science
This article presents a formal model of such sequential interactions, and a corresponding algorithm that combines the two established frameworks Pepper and Bayesian policy reuse. ...
This allows the agent to quickly select the appropriate policy against the opponent. ...
Learning in non-stationary environments is another related area since these approaches explicitly model changes in the environment. ...
doi:10.1007/978-3-319-71682-4_15
fatcat:ksq7drp2frhblinmbyoyi3kqbm
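The Bayesian policy reuse half of the combination above can be sketched compactly: maintain a belief over known opponent types, update it from the payoff observed after each episode, and reuse the stored policy that is best in expectation under the belief. The types, policies, and Gaussian performance models below are invented for the example.

```python
# Compact sketch of a Bayesian policy reuse step; performance models
# (mean payoff per policy/type pair, sigma fixed at 1.0) are invented.
import math

TYPES = ["aggressive", "cautious"]
POLICIES = ["exploit", "safe"]
MEAN = {("exploit", "aggressive"): 0.0, ("exploit", "cautious"): 3.0,
        ("safe",    "aggressive"): 1.0, ("safe",    "cautious"): 2.0}

def likelihood(payoff, policy, opp_type, sigma=1.0):
    mu = MEAN[(policy, opp_type)]
    return math.exp(-((payoff - mu) ** 2) / (2 * sigma ** 2))

belief = {t: 1.0 / len(TYPES) for t in TYPES}
for payoff in [1.4, 1.6, 0.1]:            # observed returns, illustrative
    # reuse the policy with best expected payoff under the current belief
    policy = max(POLICIES,
                 key=lambda p: sum(belief[t] * MEAN[(p, t)] for t in TYPES))
    post = {t: belief[t] * likelihood(payoff, policy, t) for t in TYPES}
    z = sum(post.values())
    belief = {t: v / z for t, v in post.items()}
    print(f"played {policy}, saw {payoff:+.1f}, belief now {belief}")
```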
Towards Efficient Detection and Optimal Response against Sophisticated Opponents
[article]
2019
arXiv pre-print
Previous works usually assume an opponent uses a stationary strategy or randomly switches among several stationary ones. ...
This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies. ...
However, these approaches show no adaptation to non-stationary opponents (Hernandez-Leal et al. 2017a). ...
arXiv:1809.04240v5
fatcat:7hw4355wabgnjljcah34i5epqa
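One hedged way to picture the detection problem this entry addresses: score a "stationary" hypothesis against a "higher-level reasoner" hypothesis by one-step prediction accuracy, and trust whichever explains the opponent better. The toy rock-paper-scissors game and best-response table are assumptions; this is not the Bayes-ToMoP algorithm.

```python
# Score two hypotheses about the opponent by prediction accuracy:
# "stationary mixture" vs "best-responds to our last move".
from collections import Counter

BEST_RESPONSE = {"R": "P", "P": "S", "S": "R"}   # rock-paper-scissors

def detect(our_moves, their_moves):
    counts = Counter()
    score = {"stationary": 0, "higher-level": 0}
    for t in range(1, len(their_moves)):
        # hypothesis 1: opponent repeats its most frequent move so far
        counts[their_moves[t - 1]] += 1
        stationary_pred = counts.most_common(1)[0][0]
        # hypothesis 2: opponent best-responds to our previous move
        tom_pred = BEST_RESPONSE[our_moves[t - 1]]
        score["stationary"] += stationary_pred == their_moves[t]
        score["higher-level"] += tom_pred == their_moves[t]
    return score

ours = ["R", "P", "S", "R", "P", "S"]
theirs = ["P", "P", "S", "R", "P", "S"]  # best-responds to our last move
print(detect(ours, theirs))  # the higher-level hypothesis predicts better
```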
What constitutes a convention? Implications for the coexistence of conventions
2005
Journal of evolutionary economics
Sugden (1995) suggests that in such a model, there can be a stationary state of convention coexistence only if interaction is non-uniform across social space. ...
A model of repeated play of a coordination game, where stage games have a location in social space, and players receive noisy signals of the true location of their games, is reviewed. ...
The density function e(z − y) is continuous, symmetric around a mean of 0, non-decreasing on the interval [−v, 0] and non-increasing on the interval [0, v]. ...
doi:10.1007/s00191-005-0008-y
fatcat:4grjsbzs4vasxfdzvuvksvnsri
An epistemic approach to stochastic games
2018
International Journal of Game Theory
For this setting we study the epistemic concept of common belief in future rationality, which is based on the condition that players always believe that their opponents will choose rationally in the future ...
That is, for both versions we can always find belief hierarchies that express common belief in future rationality, and that have stationary optimal strategies. ...
Lemma 5.1 (Stationary strategies are optimal under stationary beliefs) Consider a finite stochastic game. Let s_{−i} be a profile of stationary strategies for i's opponents. ...
doi:10.1007/s00182-018-0644-8
fatcat:4wn5abjcf5edjbnqovv6xbxvom
Effective short-term opponent exploitation in simplified poker
2008
Machine Learning
We explore two approaches to opponent modelling in the context of Kuhn poker, a small game for which game theoretic solutions are known. Parameter estimation and expert algorithms are both studied. ...
One approach is to find a pessimistic game theoretic solution (i.e. a Nash equilibrium), but human players have idiosyncratic weaknesses that can be exploited if a model of their strategy can be learned ...
In particular, it assumes a non-stationary opponent that can decide the payoffs in the game at every round. ...
doi:10.1007/s10994-008-5091-5
fatcat:rzt466ol5fcxrf2t2siijau6c4
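A minimal illustration of the parameter-estimation approach studied in this entry: track a Beta posterior over a single behavioural parameter of a Kuhn-poker-like opponent, here (hypothetically) how often it bets with the lowest card. The observations are made up.

```python
# Beta-posterior estimate of one opponent parameter: the bluff rate.
# Parameter choice and observations are hypothetical, not from the paper.
from fractions import Fraction

alpha, beta = 1, 1                 # uniform Beta(1,1) prior on bluff rate
observations = [1, 0, 1, 1, 0, 1]  # 1 = bet with lowest card, 0 = check
for bet in observations:
    alpha += bet
    beta += 1 - bet

estimate = Fraction(alpha, alpha + beta)
print(f"posterior Beta({alpha},{beta}); mean bluff rate ~ {float(estimate):.2f}")
```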