Contextual User Browsing Bandits for Large-Scale Online Mobile Recommendation
[article]
2020
arXiv
pre-print
Second, we propose a novel contextual combinatorial bandit method called UBM-LinUCB to address two issues related to positions by adopting the User Browsing Model (UBM), a click model for web search. ...
Results on two CTR metrics show that our algorithm outperforms the other contextual bandit algorithms. ...
DCM-LinUCB is a contextual bandit algorithm based on the Dependent Click Model. ...
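The UBM-based method itself is not spelled out in these snippets; as a rough illustration of the idea of discounting feedback by how likely each slot is to be examined, here is a position-weighted LinUCB sketch. The fixed examination weights `gamma`, the feature dimension `d`, and the class/method names are assumptions for illustration (a position-based simplification, not the paper's full UBM-LinUCB algorithm).

```python
import numpy as np

class PositionWeightedLinUCB:
    """Illustrative sketch: LinUCB for ranking where each slot's feedback is
    weighted by a fixed examination probability (simplified vs. full UBM)."""

    def __init__(self, d, n_positions, gamma=None, alpha=1.0):
        self.alpha = alpha                      # exploration strength
        self.gamma = gamma if gamma is not None else np.ones(n_positions)
        self.A = np.eye(d)                      # ridge-regression design matrix
        self.b = np.zeros(d)                    # accumulated response vector

    def rank(self, item_features):
        """Greedily fill the slots with the highest-UCB items."""
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        ucb = np.array([x @ theta + self.alpha * np.sqrt(x @ A_inv @ x)
                        for x in item_features])
        return np.argsort(-ucb)[: len(self.gamma)]   # item index per position

    def update(self, item_features, shown, clicks):
        """Weight each shown item's feedback by its slot's examination prob."""
        for pos, item in enumerate(shown):
            x = item_features[item]
            w = self.gamma[pos]
            self.A += w * np.outer(x, x)
            self.b += w * clicks[pos] * x
```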
arXiv:2008.09368v1
fatcat:ua4aqkhhrvhb3lvljdvnp5pi3q
Automatic ad format selection via contextual bandits
2013
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13
To balance exploration with exploitation, we pose automatic layout selection as a contextual bandit problem. There are many bandit algorithms, each generating a policy which must be evaluated. ...
We describe the development of our offline replayer, and benchmark a number of common bandit algorithms. ...
In this context, [7] presented an approach for modeling click response for different ad arrangement templates on a web-page. ...
doi:10.1145/2505515.2514700
dblp:conf/cikm/TangRSA13
fatcat:pd5ypbxtanb23otit7teyikpci
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms
2011
Proceedings of the fourth ACM international conference on Web search and data mining - WSDM '11
Contextual bandit algorithms have become popular for online recommendation systems such as Digg, Yahoo! Buzz, and news recommendation in general. ...
In this paper, we introduce a replay methodology for contextual bandit algorithm evaluation. ...
INTRODUCTION Web-based content recommendation services such as Digg, Yahoo! Buzz and Yahoo! ...
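The replay idea introduced by this paper is simple enough to sketch: stream over logged (context, displayed arm, reward) events collected under uniform-random exploration, and score a candidate policy only on the events where it would have chosen the logged arm. The log format and the `select`/`update` policy interface below are assumptions for illustration.

```python
def replay_evaluate(policy, logged_events):
    """Offline replay sketch (assumes the logging policy chose arms uniformly
    at random): only events where the policy matches the log contribute."""
    total_reward, matches = 0.0, 0
    for context, candidate_arms, logged_arm, reward in logged_events:
        chosen = policy.select(context, candidate_arms)
        if chosen == logged_arm:                    # event accepted by the replayer
            total_reward += reward
            matches += 1
            policy.update(context, chosen, reward)  # let the policy learn online
    return total_reward / max(matches, 1)           # estimated per-trial reward (e.g. CTR)
```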
doi:10.1145/1935826.1935878
dblp:conf/wsdm/LiCLW11
fatcat:5c64oezojfdx3prp7hhb5whkty
Bandit Algorithms in Information Retrieval
2019
Foundations and Trends in Information Retrieval
Dorota Głowacka (2019), "Bandit Algorithms in Information Retrieval", Foundations and Trends® in Information Retrieval: Vol. 13, No. 4, pp. 299-424. DOI: 10.1561/1500000067. ...
Chapter 3 summarizes bandit algorithms inspired by three click models: the Cascade Model (Section 3.1), the Dependent Click Model (Section 3.2), and the Position-Based Model (Section 3.3). ...
"A contextual-bandit algorithm for mobile context-aware recommender system". In: International Conference on Neural Information Processing. Springer. 324-331. Bresler, G., G. H. Chen, and D. ...
doi:10.1561/1500000067
fatcat:api5ljs5abbwdckujtsgwp27o4
Offline Evaluation and Optimization for Interactive Systems
2015
Proceedings of the Eighth ACM International Conference on Web Search and Data Mining - WSDM '15
Non-exploration data and a warm-start policy better than random: 35M impressions for training, 19M impressions for test, 880K ads, 3.4M distinct webpages, reward ∈ {0,1} (click or not). Three Algorithms ...
... to measure success: click metrics are hard to work with offline (because of their counterfactual nature); the standard solution is an A/B test, but that is expensive. ...
Speller as Contextual Bandit ...
... truly randomized data): cheap, and potentially useful, but risky (by ignoring potential confounding); need to design properly before collecting data. ...
How to Design Exploration Distributions ...
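A standard workhorse behind this kind of counterfactual offline evaluation is the inverse propensity score (IPS) estimator. The sketch below is a generic illustration, not code from the tutorial; it assumes logged propensities are available and that the target policy exposes a `prob(context, action)` method.

```python
import numpy as np

def ips_estimate(logged, target_policy):
    """Inverse propensity scoring sketch: reweight each logged reward by how
    much more (or less) likely the target policy is to take the logged action.
    `logged` holds (context, action, reward, propensity) tuples, where the
    propensity is the logging policy's probability of the logged action."""
    values = []
    for context, action, reward, propensity in logged:
        p_target = target_policy.prob(context, action)   # assumed interface
        values.append(reward * p_target / propensity)
    return float(np.mean(values))
```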
doi:10.1145/2684822.2697040
dblp:conf/wsdm/Li15
fatcat:2ap6hcpimfh5xogmdzif6ar6ri
Ensemble contextual bandits for personalized recommendation
2014
Proceedings of the 8th ACM Conference on Recommender systems - RecSys '14
In this paper, we explore ensemble strategies of contextual bandit algorithms to obtain robust predicted click-through rate (CTR) of web objects. ...
However, due to high-dimensional user/item features and the underlying characteristics of bandit policies, it is often difficult for service providers to obtain and deploy an appropriate algorithm to achieve ...
ALGORITHM This section presents two ensemble bandit algorithms, Hy-perTS and HyperTSFB, for solving the contextual recommendation problem in the cold-start situation. ...
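The snippet only names HyperTS and HyperTSFB; a generic way to ensemble base bandit policies in the spirit described here is a Beta-Bernoulli Thompson-sampling meta-bandit over the policies themselves. The base-policy interface below is an assumption for illustration, not the paper's exact algorithm.

```python
import random

class ThompsonEnsemble:
    """Meta-bandit sketch: on each request, Thompson-sample which base
    contextual-bandit policy to trust, based on the policies' observed CTRs."""

    def __init__(self, base_policies):
        self.policies = base_policies
        self.successes = [1.0] * len(base_policies)   # Beta(1, 1) priors
        self.failures = [1.0] * len(base_policies)

    def recommend(self, context, arms):
        samples = [random.betavariate(s, f)
                   for s, f in zip(self.successes, self.failures)]
        self.last = max(range(len(samples)), key=samples.__getitem__)
        return self.policies[self.last].select(context, arms)

    def feedback(self, context, arm, clicked):
        if clicked:
            self.successes[self.last] += 1
        else:
            self.failures[self.last] += 1
        for p in self.policies:        # every base policy learns from the event
            p.update(context, arm, clicked)
```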
doi:10.1145/2645710.2645732
dblp:conf/recsys/TangJLL14
fatcat:flmgmd3zqjcbvbymhzmha23tfa
A contextual-bandit approach to personalized news article recommendation
2010
Proceedings of the 19th international conference on World wide web - WWW '10
In this work, we model personalized recommendation of news articles as a contextual bandit problem, a principled approach in which a learning algorithm sequentially selects articles to serve users based ...
First, we propose a new, general contextual bandit algorithm that is computationally efficient and well motivated from learning theory. ...
CONCLUSIONS This paper takes a contextual-bandit approach to personalized web-based services such as news article recommendation. ...
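The algorithm proposed in this paper is LinUCB; a minimal sketch of its disjoint variant (one ridge regression per arm, scored by mean plus a confidence width) looks roughly like the following. The dimension `d` and exploration parameter `alpha` are placeholders.

```python
import numpy as np

class DisjointLinUCB:
    """Minimal disjoint LinUCB sketch: each arm keeps its own ridge regression
    over the context and is scored by predicted reward + alpha * uncertainty."""

    def __init__(self, n_arms, d, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(d) for _ in range(n_arms)]    # per-arm design matrices
        self.b = [np.zeros(d) for _ in range(n_arms)]  # per-arm response vectors

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            scores.append(x @ theta + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```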
doi:10.1145/1772690.1772758
dblp:conf/www/LiCLS10
fatcat:76lesokjgzc7vkl4sw5xr44ddy
Personalized Recommendation via Parameter-Free Contextual Bandits
2015
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '15
In this paper, we formulate personalized recommendation as a contextual bandit problem to solve the exploration/exploitation dilemma. ...
click-through rate. ...
The reward is the user response (e.g., a click). Therefore, personalized recommendation can be seen as an instance of the contextual bandit problem. ...
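The mapping the snippet describes (context = user, action = recommended item, reward = click) amounts to the interaction loop sketched below. The function and policy interface are illustrative assumptions, not the paper's implementation.

```python
def run_recommendation_loop(policy, stream, get_user_response):
    """Sketch of recommendation as a contextual bandit: observe user context
    and candidates, recommend one item, observe the click (reward 1 or 0),
    and update the policy; returns the empirical CTR."""
    clicks, shown = 0, 0
    for user_context, candidate_items in stream:       # assumed data stream
        item = policy.select(user_context, candidate_items)
        reward = get_user_response(user_context, item)  # 1 if clicked, else 0
        policy.update(user_context, item, reward)
        clicks += reward
        shown += 1
    return clicks / max(shown, 1)
```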
doi:10.1145/2766462.2767707
dblp:conf/sigir/TangJLZL15
fatcat:gyuie4kilngm5draxc7tivnwlq
A Map of Bandits for E-commerce
[article]
2021
arXiv
pre-print
The rich body of Bandit literature not only offers a diverse toolbox of algorithms, but also makes it hard for a practitioner to find the right solution to solve the problem at hand. ...
In this paper, we aim to reduce this gap with a structured map of Bandits to help practitioners navigate to find relevant and practical Bandit algorithms. ...
The reward depends on the context and the chosen action. The objective of a Bandit algorithm is to recommend an action at each step $t$ to maximize the expected cumulative reward over time, $\sum_{t=1}^{T} \mathbb{E}[r_t]$, where $T$ is the total number of steps. ...
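Written out in the usual notation (restored here as an assumption, since the snippet's symbols were lost in extraction), the objective is:

```latex
% Contextual-bandit objective: choose actions a_t for contexts x_t over T
% steps to maximize expected cumulative reward, equivalently to minimize
% regret against the best per-step action a_t^*.
\max_{\pi}\; \mathbb{E}\Big[\sum_{t=1}^{T} r_t(x_t, a_t)\Big]
\qquad\Longleftrightarrow\qquad
\min_{\pi}\; \mathbb{E}\Big[\sum_{t=1}^{T} \big(r_t(x_t, a_t^{*}) - r_t(x_t, a_t)\big)\Big]
```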
arXiv:2107.00680v1
fatcat:7gl37h4yrrbfhdy4q5eyk7usbq
Accelerated learning from recommender systems using multi-armed bandit
[article]
2019
arXiv
pre-print
The gold standard for evaluating recommendation algorithms has been the A/B test since it is an unbiased way to estimate how well one or more algorithms compare in the real world. ...
We argue for multi-armed bandit (MAB) testing as a solution to these issues. ...
ACKNOWLEDGMENTS The authors would like to thank Travis Brady, Pavlos Mitsoulis Ntompos, Ben Dundee, Kurt Smith, and John Meakin for their internal review of this paper and their helpful feedback. ...
arXiv:1908.06158v1
fatcat:7rp3l5ea25feliymdm6cyeuska
Query Completion Using Bandits for Engines Aggregation
[article]
2017
arXiv
pre-print
We tackle this problem under the bandits setting and evaluate four strategies to overcome this challenge. ...
There are many possible strategies for query auto-completion and a challenge is to design one optimal engine that considers and uses all available information. ...
Thanks to Coveo for providing the data and the infrastructure and to the Natural Sciences and Engineering Research Council of Canada (NSERC) for the research grant EGP 492531-15. ...
arXiv:1709.04095v1
fatcat:n4skx7ld2zcxlpxmv6vg3mddry
Reinforcement Learning for Online Information Seeking
[article]
2019
arXiv
pre-print
In this paper, we give an overview of deep reinforcement learning for search, recommendation, and online advertising from methodologies to applications, review representative algorithms, and discuss some ...
Search, recommendation, and online advertising are the three most important information-providing mechanisms on the web. ...
sequentially for specific users based on the users' and articles' contextual information, in order to maximize the total user clicks. ...
arXiv:1812.07127v4
fatcat:pyc75g5hufcs5b3f75gonbkp24
Introduction to Multi-Armed Bandits
2019
Foundations and Trends® in Machine Learning
Example applications (action → reward): medical trials, which drug to prescribe → healthy or not; web design, font color or page layout → #clicks; web content, which items/articles to emphasize → #clicks; web search, which search results to show for a query → ...
... depends only on the chosen arm, but is not known to the algorithm. ...
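For the basic setting these snippets describe (a fixed reward distribution per arm, no context), a minimal UCB1 sketch is shown below; the arm count, horizon, and `pull` callback are placeholders for illustration.

```python
import math

def ucb1(n_arms, pull, horizon):
    """Minimal UCB1 sketch for the stochastic multi-armed bandit: play each
    arm once, then pick the arm maximizing mean + sqrt(2 ln t / n_pulls).
    `pull(arm)` should return a reward in [0, 1]."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:                      # initial round-robin over arms
            arm = t - 1
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return sums, counts
```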
doi:10.1561/2200000068
fatcat:5drse7hks5fuzd6hriwrlgp27a
Contextual Bandit Applications in Customer Support Bot
[article]
2021
arXiv
pre-print
It includes intent disambiguation based on neural-linear bandits (NLB) and contextual recommendations based on a collection of multi-armed bandits (MAB). ...
Adaptable learning techniques, like contextual bandits, are a natural fit for this problem setting. ...
Lessons from Contextual Bandit Learning in a ...
... intent disambiguation and contextual recommendations for a virtual support agent. ...
arXiv:2112.03210v1
fatcat:m4eig5htorge7oko3xgkfs3ffe
DCM Bandits: Learning to Rank with Multiple Clicks
[article]
2016
arXiv
pre-print
We propose DCM bandits, an online learning variant of the DCM where the goal is to maximize the probability of recommending satisfactory items, such as web pages. ...
This work presents the first practical and regret-optimal online algorithm for learning to rank with multiple clicks in a cascade-like click model. ...
DCM Bandits We propose a learning variant of the dependent click model (Section 3.1) and a computationally-efficient algorithm for solving it (Section 3.3). ...
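As background for the learning problem described here, the dependent click model itself can be simulated in a few lines; the attraction and satisfaction probabilities below are illustrative placeholders, not parameters or code from the paper.

```python
import random

def simulate_dcm_clicks(ranked_items, attraction, satisfaction):
    """Dependent Click Model sketch: the user scans the list top-down, clicks
    an item with its attraction probability, and after a click stops with the
    position's satisfaction probability; otherwise keeps scanning."""
    clicks = []
    for pos, item in enumerate(ranked_items):
        if random.random() < attraction[item]:        # item looks attractive
            clicks.append(pos)
            if random.random() < satisfaction[pos]:   # satisfied, stop browsing
                break
    return clicks
```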
arXiv:1602.03146v2
fatcat:yixxwteyn5ha5ayfugrf62juta
Showing results 1 — 15 out of 372 results