Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment
[article]
2022
arXiv
pre-print
The multi-armed bandit (MAB) is a classic model for understanding the exploration-exploitation trade-off. The traditional MAB model for recommendation systems assumes that the user stays in the system for the entire learning horizon. In new online education platforms such as ALEKS or new video recommendation systems such as TikTok and YouTube Shorts, the amount of time a user spends on the app depends on how engaging the recommended content is. Users may temporarily leave the system if the recommended
arXiv:2205.13566v1
fatcat:ngve5ruj7zgpplaxmhjqqlb52a