Multi-armed bandit problems concern decision making when selecting a slot machine among many slot machines with initially uncertain hit probabilities to maximize the total reward; this is a fundamental problem of reinforcement learning. Furthermore, competitive multi-armed bandit problems involve multiple agents in play, manifesting fundamental concerns regarding social figures, not just individual rewards. A representative issue is selection conflict, in which multiple players select the same
doi:10.1587/nolta.13.582 fatcat:rdrtda6smzgdbgv6baymqlgw2m
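
To make the setting in the abstract concrete, the following is a minimal sketch of a competitive bandit simulation. It is not the method of the cited work: the epsilon-greedy policy, the function name `run_bandit`, its parameters, and the rule that players who pick the same machine split its reward equally are all illustrative assumptions.

```python
import random


def run_bandit(hit_probs, n_players=2, n_rounds=2000, epsilon=0.1, seed=0):
    """Simulate independent epsilon-greedy players on shared slot machines.

    Returns (total_reward_per_player, number_of_rounds_with_a_conflict).
    All names and the reward-splitting rule are hypothetical.
    """
    rng = random.Random(seed)
    n_arms = len(hit_probs)
    # Per-player pull counts and running mean reward for each machine.
    counts = [[0] * n_arms for _ in range(n_players)]
    means = [[0.0] * n_arms for _ in range(n_players)]
    totals = [0.0] * n_players
    conflicts = 0

    for _ in range(n_rounds):
        choices = []
        for p in range(n_players):
            if rng.random() < epsilon:
                # Explore: pick a machine uniformly at random.
                arm = rng.randrange(n_arms)
            else:
                # Exploit: pick the machine with the best estimate so far.
                arm = max(range(n_arms), key=lambda a: means[p][a])
            choices.append(arm)

        # Selection conflict: at least two players chose the same machine.
        if len(set(choices)) < n_players:
            conflicts += 1

        for arm in sorted(set(choices)):
            pickers = [p for p in range(n_players) if choices[p] == arm]
            hit = 1.0 if rng.random() < hit_probs[arm] else 0.0
            # Assumption: a shared machine's reward is split equally.
            share = hit / len(pickers)
            for p in pickers:
                counts[p][arm] += 1
                means[p][arm] += (share - means[p][arm]) / counts[p][arm]
                totals[p] += share

    return totals, conflicts
```

Calling `run_bandit([0.2, 0.5, 0.8])` returns each player's accumulated reward together with the number of rounds in which a conflict occurred. Because every player learns independently, all of them tend to drift toward the same high-probability machine, which is exactly the situation where individual reward and collective outcome diverge.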