10,101 Hits in 2.5 sec

Real or fake? Natural and artificial social stimuli elicit divergent behavioural and neural responses in mangrove rivulus, Kryptolebias marmoratus

Cheng-Yu Li, Hans A. Hofmann, Melissa L. Harris, Ryan L. Earley
2018 Proceedings of the Royal Society B: Biological Sciences  
model opponent, regular mirror, non-reversing mirror and live opponent.  ...  of the basolateral amygdala and hippocampus, but lower IEG expression in the preoptic area, than fighting with a non-reversing mirror image or live opponent; (iii) stationary models elicited the least  ...  (LO, live opponent; RMS, regular mirror-image stimulation; NMS, non-reversing mirror-image stimulation; SMO, stationary model opponent).  ... 
doi:10.1098/rspb.2018.1610 pmid:30429304 pmcid:PMC6253381 fatcat:pm672pybzjcc3moc7yalbzv3w4

Efficiently detecting switches against non-stationary opponents

Pablo Hernandez-Leal, Yusen Zhan, Matthew E. Taylor, L. Enrique Sucar, Enrique Munoz de Cote
2016 Autonomous Agents and Multi-Agent Systems  
Moreover, some works also assume the opponent will use a stationary strategy.  ...  This will turn the problem into learning in a non-stationary environment, posing a problem for most learning algorithms.  ...  circles for the non-stationary opponent (NSopponent).  ... 
doi:10.1007/s10458-016-9352-6 fatcat:deyd52hqbjg3xbzqsjl5stgedu
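
The switch-detection idea in the snippet above can be illustrated with a small sketch: compare the opponent's recent empirical action distribution against a longer-run model and declare a switch when the two diverge. This is a generic drift-detection illustration, not the paper's algorithm; the class name, window size and threshold are all invented for the example.

```python
# Generic sketch of opponent-switch detection via distribution drift.
# Not the algorithm from the paper; window and threshold are arbitrary.
from collections import Counter, deque

class SwitchDetector:
    def __init__(self, actions, window=20, threshold=0.4):
        self.actions = actions             # opponent's action set
        self.window = window               # size of the "recent" window
        self.threshold = threshold         # total-variation distance that triggers a switch
        self.recent = deque(maxlen=window)
        self.model = Counter()             # long-run action counts

    def _dist(self, counts, total):
        return {a: counts[a] / total for a in self.actions}

    def observe(self, action):
        """Record one opponent action; return True if a switch is suspected."""
        self.recent.append(action)
        self.model[action] += 1
        if len(self.recent) < self.window:
            return False
        recent_d = self._dist(Counter(self.recent), len(self.recent))
        model_d = self._dist(self.model, sum(self.model.values()))
        tv = 0.5 * sum(abs(recent_d[a] - model_d[a]) for a in self.actions)
        if tv > self.threshold:            # distributions diverged: likely a new strategy
            self.model = Counter(self.recent)  # restart the model from recent data
            return True
        return False
```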

Learning to Model Opponent Learning (Student Abstract)

Ian Davies, Zheng Tian, Jun Wang
2020 Proceedings of the AAAI Conference on Artificial Intelligence  
Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies.  ...  The adaptation and learning of other agents induces non-stationarity in the environment dynamics.  ...  Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies.  ... 
doi:10.1609/aaai.v34i10.7157 fatcat:i665llqczre65dcxadzbhn3mya

An exploration strategy for non-stationary opponents

Pablo Hernandez-Leal, Yusen Zhan, Matthew E. Taylor, L. Enrique Sucar, Enrique Munoz de Cote
2016 Autonomous Agents and Multi-Agent Systems  
Second, we propose a new algorithm called R-max# for learning and planning against non-stationary opponents.  ...  R-max# makes efficient use of exploration experiences, which results in rapid adaptation and efficient DE, to deal with the non-stationary nature of the opponent.  ...  Another related approach is MDP4.5 [25] which is designed to learn against switching (non-stationary) opponents. MDP4.5 uses decision trees to learn a model of the opponent.  ... 
doi:10.1007/s10458-016-9347-3 fatcat:ye7swi6bmrh2vpjdf5lspl3xte
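
R-max# builds on the classic R-max principle of optimism in the face of uncertainty. The sketch below shows only that underlying principle, treating any state-action pair visited fewer than M_KNOWN times as maximally rewarding so the agent is driven to try it; the paper's actual contribution, re-exploring already "known" pairs to track a drifting opponent, is not reproduced here, and all constants are illustrative.

```python
# Minimal sketch of R-max-style optimism (not R-max# itself).
from collections import defaultdict

R_MAX, M_KNOWN = 1.0, 5          # optimistic reward value; visits before a pair counts as "known"

counts = defaultdict(int)         # (state, action) -> number of visits
reward_sum = defaultdict(float)   # (state, action) -> accumulated observed reward

def estimated_reward(state, action):
    """Optimistic estimate: insufficiently visited pairs look maximally rewarding."""
    n = counts[(state, action)]
    return R_MAX if n < M_KNOWN else reward_sum[(state, action)] / n

def choose_action(state, actions):
    """Greedy choice under optimism, which implicitly drives exploration."""
    return max(actions, key=lambda a: estimated_reward(state, a))

def update(state, action, reward):
    counts[(state, action)] += 1
    reward_sum[(state, action)] += reward
```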

A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity [article]

Pablo Hernandez-Leal, Michael Kaisers, Tim Baarslag, Enrique Munoz de Cote
2019 arXiv   pre-print
The key challenge in multiagent learning is learning a best response to the behaviour of other agents, which may be non-stationary: if the other agents adapt their strategy as well, the learning target  ...  This survey presents a coherent overview of work that addresses opponent-induced non-stationarity with tools from game theory, reinforcement learning and multi-armed bandits.  ...  Updating that model (and therefore the policy) is the way to keep up with a non-stationary opponent.  ... 
arXiv:1707.09183v2 fatcat:mnducjpn7zawpnw3u6wnhhc6k4
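
The survey's recurring point is that the opponent model must be kept current as the opponent adapts. One minimal way to do this (a generic sketch, not a method prescribed by the survey; the decay rate is an arbitrary illustrative choice) is exponential forgetting, so recent observations outweigh stale ones:

```python
# Recency-weighted opponent model: old evidence fades, fresh evidence dominates.
def update_opponent_model(model, observed_action, decay=0.9):
    """model maps each opponent action to an estimated probability."""
    for a in model:
        model[a] *= decay                 # old evidence fades
        if a == observed_action:
            model[a] += (1.0 - decay)     # fresh evidence dominates
    return model

# Usage: starting from a uniform model, a run of "defect" observations quickly
# shifts the estimate, so the learner's best response can follow the opponent.
model = {"cooperate": 0.5, "defect": 0.5}
for _ in range(10):
    model = update_opponent_model(model, "defect")
print(model)  # probability mass has moved toward "defect"
```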

Cooperation without Exploitation between Self-interested Agents [chapter]

Steven Damer, Maria Gini
2013 Advances in Intelligent Systems and Computing  
We present experimental results obtained against different types of non-stationary opponents. The results show that a small number of games is sufficient to achieve cooperation.  ...  The agent learns if the opponent is willing to cooperate by tracking the attitude of its opponent, which tells how much the opponent values its own payoff relative to the agent's payoff.  ...  Background on cooperation model: We use the model presented in [6] and extend the work in [7] to non-stationary opponents.  ... 
doi:10.1007/978-3-642-33932-5_51 fatcat:zskol54cavhjppdwkye57iflci
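
"Attitude" here is, informally, how much weight an agent places on the other agent's payoff relative to its own. As a hedged illustration of how such a parameter might be inferred from observed choices (the grid search and payoff representation are assumptions for this example, not the paper's procedure):

```python
# Illustrative inference of an opponent's "attitude" parameter.
def best_action(payoff_own, payoff_other, attitude):
    """Index of the action maximizing own payoff plus attitude-weighted other payoff."""
    scores = [po + attitude * pt for po, pt in zip(payoff_own, payoff_other)]
    return max(range(len(scores)), key=scores.__getitem__)

def estimate_attitude(observations):
    """observations: list of (payoff_own, payoff_other, chosen_action_index) tuples.

    Returns the candidate attitude in [-1, 1] that explains the most choices.
    """
    grid = [i / 10 - 1.0 for i in range(21)]
    def hits(att):
        return sum(best_action(po, pt, att) == chosen
                   for po, pt, chosen in observations)
    return max(grid, key=hits)

# An opponent that keeps choosing mutually beneficial actions would be assigned
# a positive attitude, signalling that cooperation is safe to attempt.
```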

Learning to Model Opponent Learning [article]

Ian Davies, Zheng Tian, Jun Wang
2020 arXiv   pre-print
Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies.  ...  The adaptation and learning of other agents induces non-stationarity in the environment dynamics.  ...  Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies.  ... 
arXiv:2006.03923v1 fatcat:ljhey33ubja75baqfm4f5nyena

Learning to Negotiate Optimally in Non-stationary Environments [chapter]

Vidya Narayanan, Nicholas R. Jennings
2006 Lecture Notes in Computer Science  
In so doing, we present a new framework for adaptive negotiation in such non-stationary environments and develop a novel learning algorithm, which is guaranteed to converge, that an agent can use to negotiate  ...  Specifically, an agent learns the optimal strategy to play against an opponent whose strategy varies with time, assuming no prior information about its negotiation parameters.  ...  Now, to model this process of change in the strategies of the opponent, we use a non-stationary Markov chain.  ... 
doi:10.1007/11839354_21 fatcat:73meulrjhzforigzaobbijs2je
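
The snippet's modelling device, a non-stationary Markov chain, is simply a chain whose transition law changes over time. A toy sketch follows; the two strategy states, transition matrices and switch time are all invented for illustration.

```python
# Toy non-stationary Markov chain over opponent negotiation strategies:
# the transition matrix itself depends on the time step t.
import random

P_EARLY = {"tough": {"tough": 0.8, "soft": 0.2},
           "soft":  {"tough": 0.5, "soft": 0.5}}
P_LATE  = {"tough": {"tough": 0.3, "soft": 0.7},
           "soft":  {"tough": 0.1, "soft": 0.9}}

def step(state, t, switch_time=50):
    """Sample the opponent's next strategy; the chain's law depends on t."""
    P = P_EARLY if t < switch_time else P_LATE
    probs = P[state]
    return random.choices(list(probs), weights=list(probs.values()))[0]

state = "tough"
for t in range(100):
    state = step(state, t)  # early on the opponent stays tough; later it softens
```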

A general criterion and an algorithmic framework for learning in multi-agent systems

Rob Powers, Yoav Shoham, Thuc Vu
2006 Machine Learning  
We then provide several specific instances of the approach: an algorithm for stationary opponents, and two algorithms for adaptive opponents with bounded memory, one algorithm for the n-player case and  ...  This new criterion takes the class of opponents as a parameter.  ...  For instance, one might be interested in the classes of opponents that can be modelled by finite automata with at most k states; these include both stationary and non-stationary strategies.  ... 
doi:10.1007/s10994-006-9643-2 fatcat:mz4k4iwklnfhxegbrv6prssm3u

Rapid on-line temporal sequence prediction by an adaptive agent

Steven Jensen, Daniel Boley, Maria Gini, Paul Schrater
2005 Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems - AAMAS '05  
We consider the case of near-future event prediction by an online learning agent operating in a non-stationary environment.  ...  We demonstrate the method using both synthetic data and empirical experience from a gameplaying scenario with human opponents.  ...  To measure the adaptability of the ELPH algorithm compared to other methods, we created non-stationary processes by alternating between two mixture models with independent λ values.  ... 
doi:10.1145/1082473.1082484 dblp:conf/atal/JensenBGS05 fatcat:inib7hsksbeepp2rt2jhpxnb4u

Towards a Fast Detection of Opponents in Repeated Stochastic Games [chapter]

Pablo Hernandez-Leal, Michael Kaisers
2017 Lecture Notes in Computer Science  
This article presents a formal model of such sequential interactions, and a corresponding algorithm that combines the two established frameworks Pepper and Bayesian policy reuse.  ...  This allows the agent to quickly select the appropriate policy against the opponent.  ...  Learning in non-stationary environments is another related area since these approaches explicitly model changes in the environment.  ... 
doi:10.1007/978-3-319-71682-4_15 fatcat:ksq7drp2frhblinmbyoyi3kqbm

Towards Efficient Detection and Optimal Response against Sophisticated Opponents [article]

Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng
2019 arXiv   pre-print
Previous works usually assume an opponent uses a stationary strategy or randomly switches among several stationary ones.  ...  This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.  ...  However, these approaches show no adaptation to non-stationary opponents (Hernandez-Leal et al. 2017a) .  ... 
arXiv:1809.04240v5 fatcat:7hw4355wabgnjljcah34i5epqa

What constitutes a convention? Implications for the coexistence of conventions

Ivar Kolstad
2005 Journal of evolutionary economics  
Sugden (1995) suggests that in such a model, there can be a stationary state of convention coexistence only if interaction is non-uniform across social space.  ...  A model of repeated play of a coordination game, where stage games have a location in social space, and players receive noisy signals of the true location of their games, is reviewed.  ...  The density function e(z − y) is continuous, symmetric around a mean of 0, non-decreasing on the interval [−v, 0] and non-increasing on the interval [0, v].  ... 
doi:10.1007/s00191-005-0008-y fatcat:4grjsbzs4vasxfdzvuvksvnsri

An epistemic approach to stochastic games

Andrés Perea, Arkadi Predtetchinski
2018 International Journal of Game Theory  
For this setting we study the epistemic concept of common belief in future rationality, which is based on the condition that players always believe that their opponents will choose rationally in the future  ...  That is, for both versions we can always find belief hierarchies that express common belief in future rationality, and that have stationary optimal strategies.  ...  Lemma 5.1 (Stationary strategies are optimal under stationary beliefs) Consider a finite stochastic game. Let s_{−i} be a profile of stationary strategies for i's opponents.  ... 
doi:10.1007/s00182-018-0644-8 fatcat:4wn5abjcf5edjbnqovv6xbxvom

Effective short-term opponent exploitation in simplified poker

Finnegan Southey, Bret Hoehn, Robert C. Holte
2008 Machine Learning  
We explore two approaches to opponent modelling in the context of Kuhn poker, a small game for which game theoretic solutions are known. Parameter estimation and expert algorithms are both studied.  ...  One approach is to find a pessimistic game theoretic solution (i.e. a Nash equilibrium), but human players have idiosyncratic weaknesses that can be exploited if a model of their strategy can be learned  ...  In particular, it assumes a non-stationary opponent that can decide the payoffs in the game at every round.  ... 
doi:10.1007/s10994-008-5091-5 fatcat:rzt466ol5fcxrf2t2siijau6c4
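
The parameter-estimation approach mentioned in the snippet fits a low-dimensional model of the opponent's strategy from observed play. A deliberately simplified sketch in that spirit reduces the opponent to a single Bernoulli "bluff" tendency tracked with a Beta posterior; the paper's actual Kuhn poker opponent model has more parameters than this.

```python
# Simplified one-parameter opponent model: Beta posterior over a bluff rate.
class BluffEstimator:
    def __init__(self):
        self.alpha, self.beta = 1.0, 1.0   # uniform Beta(1, 1) prior

    def observe_showdown(self, opponent_bluffed):
        """Update after a hand where the opponent's holding was revealed."""
        if opponent_bluffed:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def bluff_probability(self):
        return self.alpha / (self.alpha + self.beta)   # posterior mean

# After only a handful of observed hands the estimate is already usable for
# picking an exploitive counter-strategy, matching the paper's short-term theme.
est = BluffEstimator()
for bluffed in [True, False, True, True]:
    est.observe_showdown(bluffed)
print(round(est.bluff_probability, 2))  # 0.67
```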
Showing results 1 — 15 out of 10,101 results