Online Learning with Cumulative Oversampling: Application to Budgeted Influence Maximization [article]

Shatian Wang, Shuoguang Yang, Zhen Xu, Van-Anh Truong
<span title="2020-09-16">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We propose a cumulative oversampling (CO) method for online learning. Our key idea is to sample parameter estimations from the updated belief space once in each round (similar to Thompson Sampling), and utilize the cumulative samples up to the current round to construct optimistic parameter estimations that asymptotically concentrate around the true parameters as tighter upper confidence bounds compared to the ones constructed with standard UCB methods. We apply CO to a novel budgeted variant
more &raquo; ... the Influence Maximization (IM) semi-bandits with linear generalization of edge weights, whose offline problem is NP-hard. Combining CO with the oracle we design for the offline problem, our online learning algorithm simultaneously tackles budget allocation, parameter learning, and reward maximization. We show that for IM semi-bandits, our CO-based algorithm achieves a scaled regret comparable to that of the UCB-based algorithms in theory, and performs on par with Thompson Sampling in numerical experiments.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="">arXiv:2004.11963v3</a> <a target="_blank" rel="external noopener" href="">fatcat:jw5lcpspcbbolcqrgt6tbgbnfa</a> </span>
<a target="_blank" rel="noopener" href="" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="" title=" access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> </button> </a>