A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022. The file type is application/pdf.
Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme
[article]
2022
arXiv
pre-print
We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private) successive elimination algorithm for strictly optimal best-arm identification; we show that our algorithm is δ-PAC and characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed
arXiv:2006.06792v4
fatcat:yfrangvg5zfuhpgh45vn332juu