A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning
2020
International Conference on Machine Learning
This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training. We propose two new games with concealed information and a complex, non-transitive reward structure (think rock/paper/scissors). It turns out that most current deep reinforcement learning methods fail to efficiently explore the strategy space, thus learning policies that generalise poorly to unseen opponents. We then propose a
dblp:conf/icml/VezhnevetsWELL20
fatcat:m5o3fkh5ona7hhuz756lz53ixe