The file type is application/pdf.
Adaptive Best-of-Both-Worlds Algorithm for Heavy-Tailed Multi-Armed Bandits
[article]
2022
arXiv
pre-print
In this paper, we generalize the concept of heavy-tailed multi-armed bandits (MAB) to adversarial environments and develop robust best-of-both-worlds algorithms for this setting, where losses have α-th (1 < α ≤ 2) moments bounded by σ^α, while the variances may not exist. Specifically, we design an algorithm that, when the heavy-tail parameters α and σ are known to the agent, simultaneously achieves the optimal regret for both stochastic and adversarial environments, without
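The moment condition in the abstract can be made concrete with a small sketch. It is not taken from the paper: it assumes, purely for illustration, that losses follow a Pareto (Type I) distribution with scale 1 and a hypothetical tail index `shape = 1.8`. For such a variable the raw moment E[X^p] equals shape / (shape - p) when p < shape and diverges otherwise, so the α-th moment is finite for α = 1.5 while the second moment (and hence the variance) does not exist.

```python
import math


def pareto_raw_moment(shape: float, p: float) -> float:
    """Return E[X^p] for a Pareto (Type I) variable with scale 1.

    The moment is finite only when p < shape; otherwise it diverges.
    """
    if p >= shape:
        return math.inf
    return shape / (shape - p)


# Hypothetical values, chosen only to illustrate the heavy-tail condition.
shape = 1.8  # tail index of the assumed loss distribution
alpha = 1.5  # moment order, which the abstract restricts to 1 < alpha <= 2

# The alpha-th moment is finite, so it is bounded by sigma^alpha for this sigma.
alpha_moment = pareto_raw_moment(shape, alpha)
sigma = alpha_moment ** (1.0 / alpha)

# The second moment diverges, so the variance does not exist.
second_moment = pareto_raw_moment(shape, 2.0)

print(f"E[X^{alpha}] = {alpha_moment:.4f}, sigma = {sigma:.4f}")
print(f"E[X^2] = {second_moment}")
```

Here `sigma` is simply the smallest value satisfying the bound for this one assumed distribution. In the setting described above, α and σ are parameters of the problem rather than quantities computed from a known distribution.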
arXiv:2201.11921v2
fatcat:24fbsknv6reuriueuep4cexzfi