Online Learning for Non-Stationary A/B Tests [article]

Andrés Muñoz Medina, Sergei Vassilvitskii, Dong Yin
2018 arXiv   pre-print
The rollout of new versions of a feature in modern applications is a manual multi-stage process, as the feature is released to ever larger groups of users, while its performance is carefully monitored. This kind of A/B testing is ubiquitous, but suboptimal, as the monitoring requires heavy human intervention, is not guaranteed to capture consistent, but short-term fluctuations in performance, and is inefficient, as better versions take a long time to reach the full population. In this work we
more » ... rmulate this question as that of expert learning, and give a new algorithm Follow-The-Best-Interval, FTBI, that works in dynamic, non-stationary environments. Our approach is practical, simple, and efficient, and has rigorous guarantees on its performance. Finally, we perform a thorough evaluation on synthetic and real world datasets and show that our approach outperforms current state-of-the-art methods.
arXiv:1802.05315v2 fatcat:tvxy25qsg5apbdmp7ynmdb4gji