518 Hits in 3.0 sec

Ensemble contextual bandits for personalized recommendation

Liang Tang, Yexi Jiang, Lei Li, Tao Li
2014 Proceedings of the 8th ACM Conference on Recommender systems - RecSys '14  
In this paper, we explore ensemble strategies of contextual bandit algorithms to obtain robust click-through rate (CTR) predictions for web objects.  ...  The ensemble is acquired by aggregating different pulling policies of bandit algorithms, rather than forcing the agreement of prediction results or learning a unified predictive model.  ...  • We propose two ensemble strategies to address the cold-start problem in personalized recommender systems, which employ a meta-bandit learning paradigm to achieve robust CTR prediction. • We conduct  ... 
doi:10.1145/2645710.2645732 dblp:conf/recsys/TangJLL14 fatcat:flmgmd3zqjcbvbymhzmha23tfa
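The meta-bandit idea described in the snippet above — aggregating the pulling decisions of several base bandit policies rather than their predictions — can be sketched minimally as follows. This is an illustrative sketch under my own assumptions (class names `EpsilonGreedy` and `MetaBandit` are hypothetical), not the authors' actual ensemble strategies:

```python
import random

class EpsilonGreedy:
    """A simple base bandit policy: epsilon-greedy over per-arm reward estimates."""
    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.rng = random.Random(seed)
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean update of the pulled arm's value estimate.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

class MetaBandit:
    """Meta-bandit: treats each base policy as an arm of an outer bandit and
    aggregates their pulling decisions, not their predicted CTRs."""
    def __init__(self, policies, epsilon=0.1, seed=1):
        self.policies = policies
        self.meta = EpsilonGreedy(len(policies), epsilon, seed)
        self.chosen = 0

    def select(self):
        # The outer bandit picks which base policy gets to pull this round.
        self.chosen = self.meta.select()
        return self.policies[self.chosen].select()

    def update(self, arm, reward):
        self.meta.update(self.chosen, reward)  # credit the policy that chose
        for p in self.policies:                # every base policy observes feedback
            p.update(arm, reward)
```

The key design choice mirrored here is that the ensemble never reconciles the base policies' value estimates; it only learns which policy's pulls earn the most reward.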

Accelerated learning from recommender systems using multi-armed bandit [article]

Meisam Hejazinia, Kyler Eastman, Shuqin Ye, Abbas Amirabadi, Ravi Divvela
2019 arXiv   pre-print
We argue for multi-armed bandit (MAB) testing as a solution to these issues.  ...  Recommendation systems are a vital component of many online marketplaces, where there are often millions of items to potentially present to users who have a wide variety of wants or needs.  ...  Another study [13] discusses ensembling content-based and collaborative-filtering-based recommendations using multi-armed bandits.  ... 
arXiv:1908.06158v1 fatcat:7rp3l5ea25feliymdm6cyeuska

Interactive Social Recommendation

Xin Wang, Steven C.H. Hoi, Chenghao Liu, Martin Ester
2017 Proceedings of the 2017 ACM on Conference on Information and Knowledge Management - CIKM '17  
manner from a collection of training data accumulated from users' historical interactions with the recommender systems.  ...  In the real world, new users may leave the system after being recommended boring items before enough data is collected to train a good model, which results in an inefficient customer  ...  In this section, we will give a mathematical description of the general idea of the multi-armed bandit (MAB) strategy in the context of recommender systems, as well as several existing multi-armed bandit  ... 
doi:10.1145/3132847.3132880 dblp:conf/cikm/WangHLE17 fatcat:l4xwvhl67nhs7djignsv5obrne

Online Context-Aware Recommendation with Time Varying Multi-Armed Bandit

Chunqiu Zeng, Qing Wang, Shekoofeh Mokhtari, Tao Li
2016 Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16  
Contextual multi-armed bandit problems have gained increasing popularity and attention in recent years due to their capability of leveraging contextual information to deliver online personalized recommendation  ...  To predict the reward of each arm given a particular context, existing relevant research studies for contextual multi-armed bandit problems often assume the existence of a fixed yet unknown reward mapping  ...  PROBLEM FORMULATION In this section, we formally define the contextual multi-armed bandit problem first, and then model the time varying contextual multi-armed bandit problem.  ... 
doi:10.1145/2939672.2939878 dblp:conf/kdd/ZengWML16 fatcat:r2r3c54nbjeqtlbajymtkpnb6y
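The "fixed yet unknown reward mapping" that the snippet above says existing contextual bandit work assumes is commonly modeled as a linear function of the context, as in LinUCB. The sketch below illustrates that standard fixed-mapping setting (the baseline the time-varying model relaxes); it is a minimal illustration, not the paper's algorithm:

```python
import numpy as np

class LinUCB:
    """Contextual bandit with a fixed linear reward mapping:
    E[reward | context x, arm a] = x . theta_a.
    Each arm keeps a ridge-regression estimate of theta_a plus an
    upper-confidence exploration bonus."""
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm Gram matrix
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward vector

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            # UCB score = point estimate + confidence-width bonus
            scores.append(x @ theta + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

Because `theta_a` is assumed constant, repeated observations only sharpen the estimate; a time-varying formulation instead has to detect and discard observations generated under an outdated mapping.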

Dynamic Ensemble Active Learning: A Non-Stationary Bandit with Expert Advice

Kunkun Pang, Mingzhi Dong, Yang Wu, Timothy M. Hospedales
2018 2018 24th International Conference on Pattern Recognition (ICPR)  
We develop a dynamic ensemble active learner based on a non-stationary multi-armed bandit with expert advice algorithm.  ...  This has motivated research into ensembles of active learners that learn what constitutes a good criterion in a given scenario, typically via multi-armed bandit algorithms.  ...  Bandit Algorithms / Multi-armed Bandit: In multi-armed bandit (MAB) problems, a player pulls a lever from a set K = {1, . . . , K} of slot machines in a sequence of time steps T = {1, . . . , T} to maximise  ... 
doi:10.1109/icpr.2018.8545422 dblp:conf/icpr/PangDWH18 fatcat:p5vz2ehjgza6neial27pub2qbi
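The MAB formulation quoted above (pull one of K levers at each step t in {1, ..., T} to maximise cumulative reward) is often handled in the non-stationary/adversarial setting by EXP3, the base algorithm underlying bandits with expert advice. Below is a minimal EXP3 sketch under my own assumptions; it is not the authors' DEAL algorithm:

```python
import math
import random

def exp3(n_arms, reward_fn, T, gamma=0.1, seed=0):
    """EXP3 for the adversarial multi-armed bandit: at each step the player
    pulls one arm from {0, ..., n_arms-1}, observes only that arm's reward
    in [0, 1], and aims to maximise cumulative reward."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    total = 0.0
    for t in range(T):
        wsum = sum(weights)
        # Mix exponential weights with uniform exploration of rate gamma.
        probs = [(1.0 - gamma) * w / wsum + gamma / n_arms for w in weights]
        r, acc, arm = rng.random(), 0.0, n_arms - 1
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                arm = i
                break
        reward = reward_fn(t, arm)
        total += reward
        # Importance-weighted update: only the pulled arm's weight changes.
        weights[arm] *= math.exp(gamma * reward / (probs[arm] * n_arms))
        m = max(weights)
        weights = [w / m for w in weights]  # normalise to avoid overflow
    return total, weights
```

The importance weighting (dividing the observed reward by the pull probability) is what lets EXP3 cope with rewards that change adversarially over time, which is why it is a natural base for non-stationary active-learning ensembles.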

Dynamic Ensemble Active Learning: A Non-Stationary Bandit with Expert Advice [article]

Kunkun Pang, Mingzhi Dong, Yang Wu, Timothy M. Hospedales
2018 arXiv   pre-print
We develop a dynamic ensemble active learner based on a non-stationary multi-armed bandit with expert advice algorithm.  ...  This has motivated research into ensembles of active learners that learn what constitutes a good criterion in a given scenario, typically via multi-armed bandit algorithms.  ...  Bandit Algorithms / Multi-armed Bandit: In multi-armed bandit (MAB) problems, a player pulls a lever from a set K = {1, . . . , K} of slot machines in a sequence of time steps T = {1, . . . , T} to maximise  ... 
arXiv:1810.07778v1 fatcat:tvahebohmraahhqwc6jgmaeffy

Cluster Based Deep Contextual Reinforcement Learning for top-k Recommendations [article]

Anubha Kabra, Anu Agarwal, Anil Singh Parihar
2020 arXiv   pre-print
Rapid advancements in the E-commerce sector over the last few decades have led to an imminent need for personalised, efficient and dynamic recommendation systems.  ...  To sufficiently cater to this need, we propose a novel method for generating top-k recommendations by creating an ensemble of clustering with reinforcement learning.  ...  Our proposed work combines a contextual multi-armed bandit strategy with DBSCAN clustering and a Dueling Bandits exploration strategy to provide highly personalised recommendations in a more efficient manner  ... 
arXiv:2012.02291v1 fatcat:ak7qbf6ngve3hlz6ozuvz3wq4y

Deep neural network marketplace recommenders in online experiments [article]

Simen Eide, Ning Zhou
2018 arXiv   pre-print
models combining features from user engagement and content, sequence-based models, and multi-armed bandit models that optimize user engagement by re-ranking proposals from multiple submodels.  ...  This paper focuses on the challenge of measuring recommender performance and summarizes the online experiment results with several promising types of deep neural network recommenders - hybrid item representation  ...  - hybrid item representation, sequence-based model, multi-armed bandit - and analyzes their performance in online experiments.  ...  The multi-armed bandit models were tested in two groups  ... 
arXiv:1809.02130v1 fatcat:hbe62jvfhvcp5i3bfvmaoulgoa

A Multi-Armed Bandit Model Selection for Cold-Start User Recommendation

Crícia Z. Felício, Klérisson V.R. Paixão, Celia A.Z. Barcelos, Philippe Preux
2017 Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization - UMAP '17  
Assuming that a number of alternative prediction models are available to select items to recommend to a cold user, this paper introduces a multi-armed bandit based model selection, named PdMS.  ...  Recommender systems usually keep a substantial amount of prediction models that are available for analysis. Moreover, recommendations to new users yield uncertain returns.  ...  already tuned model available in the system [20]. The second insight is to consider our goal as a multi-armed bandit (MAB) problem [3].  ... 
doi:10.1145/3079628.3079681 dblp:conf/um/FelicioPBP17 fatcat:uh2qxrk2kbcgxf5e5z47htc3mq

Data Efficient Training for Reinforcement Learning with Adaptive Behavior Policy Sharing [article]

Ge Liu, Rui Wu, Heng-Tze Cheng, Jing Wang, Jayden Ooi, Lihong Li, Ang Li, Wai Lok Sibon Li, Craig Boutilier, Ed Chi
2020 arXiv   pre-print
However, training deep RL models is challenging in real world applications such as production-scale health-care or recommender systems because of the expensiveness of interaction and limitation of budget  ...  Adaptive Behavior Policy Sharing (ABPS), a data-efficient training algorithm that allows sharing of experience collected by a behavior policy that is adaptively selected from a pool of agents trained with an ensemble  ...  health-care or recommender systems due to some practical constraints.  ... 
arXiv:2002.05229v1 fatcat:uqob4xze3bhdpiqczlmxq3g3ri

The Use of Bandit Algorithms in Intelligent Interactive Recommender Systems [article]

Qing Wang
2021 arXiv   pre-print
Multi-armed bandit algorithms, which have been widely applied into various online systems, are quite capable of delivering such efficient recommendation services.  ...  However, few existing bandit models are able to adapt to new changes introduced by the modern recommender systems.  ...  Contextual Multi-armed Bandit Interactive recommender systems play an essential role in our daily life due to the abundance of online services [37] in this information age.  ... 
arXiv:2107.00161v1 fatcat:dv2s3ezfmjazxpeekhskalwh5u

A Novel Approach to Address External Validity Issues in Fault Prediction Using Bandit Algorithms

Teruki HAYAKAWA, Masateru TSUNODA, Koji TODA, Keitaro NAKASAI, Amjed TAHIR, Kwabena Ebo BENNIN, Akito MONDEN, Kenichi MATSUMOTO
2021 IEICE transactions on information and systems  
Our results showed that bandit algorithms can provide promising outcomes when used in fault prediction. key words: defect prediction, multi-armed bandit, diversity of datasets, dynamic model selection  ...  In this work, we propose the use of bandit algorithms in cases where the accuracy of the models is inconsistent across multiple datasets.  ...  The multi-armed bandit problem is to sequentially seek the best candidates (referred to as arms), whose expected rewards are unknown, in order to maximize the total reward.  ... 
doi:10.1587/transinf.2020edl8098 fatcat:55lig2j3xfbr3lszqrx3bh6udq

Personalized Advertisement Recommendation: A Ranking Approach to Address the Ubiquitous Click Sparsity Problem [article]

Sougata Chaudhuri and Georgios Theocharous and Mohammad Ghavamzadeh
2016 arXiv   pre-print
The system uses an internal ad recommendation policy to map the user's profile (context) to one of the ads.  ...  We study the problem of personalized advertisement recommendation (PAR), which consists of a user visiting a system (website) and the system displaying one of K ads to the user.  ...  Introduction: Personalized advertisement recommendation (PAR) systems are intrinsic to many major tech companies like Google, Yahoo, Facebook and others.  ... 
arXiv:1603.01870v1 fatcat:ad4xni2zkvevjah6pe7j2trwpy

Personalized Recommendation via Parameter-Free Contextual Bandits

Liang Tang, Yexi Jiang, Lei Li, Chunqiu Zeng, Tao Li
2015 Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '15  
Most online recommender systems try to address the information needs of users by virtue of both user and content information.  ...  In this paper, we formulate personalized recommendation as a contextual bandit problem to solve the exploration/exploitation dilemma.  ...  Multi-armed bandit problem: The primary challenge in multi-armed bandit problems is to balance the tradeoff between exploration and exploitation.  ... 
doi:10.1145/2766462.2767707 dblp:conf/sigir/TangJLZL15 fatcat:gyuie4kilngm5draxc7tivnwlq

Forecasting the nearly unforecastable: why aren't airline bookings adhering to the prediction algorithm?

Saravanan Thirumuruganathan, Soon-gyo Jung, Dianne Ramirez Robillos, Joni Salminen, Bernard J. Jansen
2021 Electronic Commerce Research  
A unique aspect of the model is the incorporation of self-competence, where the model defers when it cannot reasonably make a recommendation.  ...  , rule-based recommenders, and bandit-based recommenders.  ...  . • Multi-class classifier engine Classifiers are known to be better able to use features than the recommender system and can address the cold-start issue for new customers [45] and, in our case, very  ... 
doi:10.1007/s10660-021-09457-0 fatcat:5zin2rli5nfnnoetq4fwjg3klq