
Contextual Multi-Armed Bandits for Causal Marketing [article]

Neela Sawant, Chitti Babu Namballa, Narayanan Sadagopan, Houssam Nassif
2018 arXiv   pre-print
This work explores the idea of a causal contextual multi-armed bandit approach to automated marketing, where we estimate and optimize the causal (incremental) effects.  ...  Our approach draws on strengths of causal inference, uplift modeling, and multi-armed bandits.  ...  Algorithm 2: Thompson Sampling based Contextual Multi-Armed Bandits with Online Scoring and Batch Training. Initialization: time t = 0; event log L = {}; d-dimensional bandit arm contextual distribution  ...
arXiv:1810.01859v1 fatcat:ixqhpj2wkzextkxq2d4oj7jdgy
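
A minimal sketch of the Thompson-sampling scheme the snippet describes, assuming a Bayesian linear reward model over the d-dimensional arm context. This is an illustration in the spirit of the abstract, not the authors' Algorithm 2; the class names and update rule are assumptions made for the example.

import numpy as np

class LinearTSArm:
    """One bandit arm with a Bayesian linear reward model over a d-dim context."""
    def __init__(self, d, noise_var=1.0):
        self.A = np.eye(d)        # posterior precision of the weight vector
        self.b = np.zeros(d)      # precision-weighted posterior mean
        self.noise_var = noise_var

    def sample_score(self, x):
        # Online scoring: draw one weight vector from the posterior.
        cov = np.linalg.inv(self.A)
        theta = np.random.multivariate_normal(cov @ self.b, self.noise_var * cov)
        return x @ theta          # sampled expected reward for context x

    def update(self, x, r):
        # Batch training would replay logged (x, r) events through here.
        self.A += np.outer(x, x)
        self.b += r * x

def choose_arm(arms, x):
    # Thompson sampling: pick the arm whose posterior draw scores highest.
    return max(range(len(arms)), key=lambda k: arms[k].sample_score(x))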

Treatment effect optimisation in dynamic environments

Jeroen Berrevoets, Sam Verboven, Wouter Verbeke
2022 Journal of Causal Inference  
Incorporating this target creates a causal model which we name an uplifted contextual multi-armed bandit.  ...  Applying causal methods to fields such as healthcare, marketing, and economics receives increasing interest.  ...  Acknowledgments: We wish to thank Vincent Ginis, Ann Nowé, Judea Pearl and our reviewers for their helpful comments. Funding information: JB is funded by the W.D. Armstrong Trust Fund.  ... 
doi:10.1515/jci-2020-0009 fatcat:tyt3pmevgbhl5iduomcz54eszu

Optimising Individual-Treatment-Effect Using Bandits [article]

Jeroen Berrevoets, Sam Verboven, Wouter Verbeke
2019 arXiv   pre-print
To counter this, we propose the uplifted contextual multi-armed bandit (U-CMAB), a novel approach to optimise the ITE by drawing upon bandit literature.  ...  Take, for example, the negative influence on a marketing campaign when a competitor product is released.  ...  Contextual multi-armed bandits (CMAB) differ from UM as they apply treatment as a function of the expected response only.  ...
arXiv:1910.07265v1 fatcat:s2w2pqv55beqrp6rvtvwbsla64
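
To make the snippet's contrast concrete: a plain CMAB scores an action by its expected response, while an uplift reward targets the incremental treatment effect. A hedged sketch of such a reward, assuming scikit-learn-style fitted regressors (the two-model formulation here is a common construction, not necessarily the paper's U-CMAB):

def uplift_reward(model_treated, model_control, x):
    """Incremental (uplift) signal for context x: predicted response under
    treatment minus predicted response under control. A plain CMAB would
    instead use model_treated.predict([x])[0] alone."""
    return model_treated.predict([x])[0] - model_control.predict([x])[0]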

Bandit Algorithms for Precision Medicine [article]

Yangyi Lu, Ziping Xu, Ambuj Tewari
2021 arXiv   pre-print
Since precision medicine focuses on the use of patient characteristics to guide treatment, contextual bandit algorithms are especially useful since they are designed to take such information into account  ...  The Oxford English Dictionary defines precision medicine as "medical care designed to optimize efficiency or therapeutic benefit for particular groups of patients, especially by using genetic or molecular  ...  Multi-armed Bandit In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in many application areas such as healthcare, marketing, and recommendation systems.  ... 
arXiv:2108.04782v1 fatcat:dni5wyzyerestgs3upuzz776n4

A Causal Approach to Prescriptive Process Monitoring

Zahra Dasht Bozorgi
2021 International Conference on Business Process Management  
Contextual bandits are an extension of multi-armed bandits: they output an action conditional on the state of the environment.  ...  In particular, contextual bandit algorithms have drawn on the causal inference literature to make them less prone to estimation bias [15].  ...
dblp:conf/bpm/Bozorgi21 fatcat:lvxhrpdxdjglzavhhbh3rrsvxq

A Map of Bandits for E-commerce [article]

Yi Liu, Lihong Li
2021 arXiv   pre-print
The rich body of Bandit literature not only offers a diverse toolbox of algorithms, but also makes it hard for a practitioner to find the right solution to solve the problem at hand.  ...  In this paper, we aim to reduce this gap with a structured map of Bandits to help practitioners navigate to find relevant and practical Bandit algorithms.  ...  Best-arm Identification In some bandit applications, our goal is not to maximize reward during an experiment, but to identify the best action (e.g., best marketing campaign strategy) at the end of the  ... 
arXiv:2107.00680v1 fatcat:7gl37h4yrrbfhdy4q5eyk7usbq
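
For the best-arm identification setting the snippet mentions, the objective is a confident recommendation at the end of the experiment rather than high reward during it. A minimal sketch of the simplest such design, uniform allocation followed by an empirical-best recommendation (illustrative only; practical designs adapt the allocation):

import numpy as np

def identify_best_arm(pull, n_arms, budget):
    """Spend a fixed budget of pulls round-robin, then recommend the arm with
    the highest empirical mean; regret during the experiment is ignored."""
    rewards = [[] for _ in range(n_arms)]
    for t in range(budget):
        k = t % n_arms               # uniform, non-adaptive exploration
        rewards[k].append(pull(k))
    return int(np.argmax([np.mean(r) for r in rewards]))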

VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning [article]

Raghav Awasthi, Keerat Kaur Guliani, Saif Ahmad Khan, Aniket Vashishtha, Mehrab Singh Gill, Arshita Bhatt, Aditya Nagori, Aniket Gupta, Ponnurangam Kumaraguru, Tavpritesh Sethi
2021 arXiv   pre-print
We approach this problem by proposing a novel pipeline, VacSIM, that dovetails Deep Reinforcement Learning models into a Contextual Bandits approach for optimizing the distribution of the COVID-19 vaccine.  ...  Whereas the Reinforcement Learning models suggest better actions and rewards, Contextual Bandits allow online modifications that may need to be implemented on a day-to-day basis in real-world scenarios  ...  Tavpritesh Sethi and the Center for Artificial Intelligence at IIIT-Delhi.  ...
arXiv:2009.06602v3 fatcat:2yfa3xapyna5lnlryuq6237uae

Multi-armed bandit experiments in the online service economy

Steven L. Scott
2015 Applied Stochastic Models in Business and Industry  
This article briefly summarizes multi-armed bandit experiments, where the experimental design is modified as the experiment progresses to reduce the cost of experimenting.  ...  Contextual information: The multi-armed bandit can be sensitive to the assumed model for the rewards distribution.  ...  Multi-armed bandit experiments: A multi-armed bandit is a sequential experiment where the goal is to produce the largest reward. In the typical setup there are K actions or "arms."  ...
doi:10.1002/asmb.2104 fatcat:c23qh6fznfddhimr2he7qheyta
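
As a concrete instance of such an adaptive experiment, here is a Beta-Bernoulli Thompson-sampling sketch: traffic shifts toward better-performing arms as evidence accumulates, which is how the design reduces the cost of experimenting. This is a generic sketch under assumed binary rewards, not code from the article.

import numpy as np

def bernoulli_ts_experiment(pull, K, horizon):
    """Run an adaptive K-armed experiment with 0/1 rewards.
    pull(k) executes arm k once and returns its observed reward."""
    wins, losses = np.ones(K), np.ones(K)             # Beta(1, 1) priors
    for _ in range(horizon):
        k = int(np.argmax(np.random.beta(wins, losses)))  # one draw per arm
        r = pull(k)
        wins[k] += r
        losses[k] += 1 - r
    return wins - 1, losses - 1                       # successes, failures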

AutoML for Contextual Bandits [article]

Praneet Dutta, Joe Cheuk, Jonathan S Kim, Massimo Mascaro
2022 arXiv   pre-print
Contextual bandits are among the most widely used techniques in applications such as personalization, recommendation systems, mobile health, and causal marketing.  ...  We propose an end-to-end automated meta-learning pipeline to approximate the optimal Q function for contextual bandit problems.  ...  It is an extension of the multi-armed bandit problem [14], generalizing it with the concept of a context.  ...
arXiv:1909.03212v2 fatcat:scixdaefijbupj4nw7wcdsesva

Uplift Modeling for Multiple Treatments with Cost Optimization [article]

Zhenyu Zhao, Totte Harinen
2020 arXiv   pre-print
It can be used for optimizing the performance of interventions such as marketing campaigns and product designs.  ...  An important but so far neglected use case for uplift modeling is an experiment with multiple treatment groups that have different costs, for example when different communication channels and promotion  ...  for collaboration on use cases.  ...
arXiv:1908.05372v3 fatcat:h4v2daihonejpnzo7fs6ppnp3e
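
The cost-optimization idea in the snippet can be stated as a simple decision rule: treat each individual with the option whose estimated uplift, net of its cost, is highest, and withhold treatment when no option nets positive. This netting rule is a common formulation offered here as an assumption, not necessarily the paper's exact objective.

def best_treatment(uplift_estimates, costs):
    """uplift_estimates[k]: estimated incremental value of treatment k for
    this individual; costs[k]: its cost. Returns the chosen treatment key,
    or None to leave the individual in the control group."""
    net = {k: uplift_estimates[k] - costs[k] for k in uplift_estimates}
    k_best = max(net, key=net.get)
    return k_best if net[k_best] > 0 else None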

Rate-Optimal Contextual Online Matching Bandit [article]

Yuantong Li, Chi-hua Wang, Guang Cheng, Will Wei Sun
2022 arXiv   pre-print
Existing works focus on multi-armed bandits with static preferences, but this is insufficient: the two-sided preference changes as long as one side's contextual information updates, resulting in non-static  ...  This motivates us to consider a novel Contextual Online Matching Bandit prOblem (COMBO), which allows dynamic preferences in matching decisions.  ...  However, these works do not consider the arms' contextual information and hence are not capable of tackling our dynamic matching problem. Centralized Multi-Agent Bandit for Matching.  ...
arXiv:2205.03699v1 fatcat:zn5e42g3dnhgljyup4k6l5gk3a

Reinforcement Learning in Practice: Opportunities and Challenges [article]

Yuxi Li
2022 arXiv   pre-print
Then we discuss opportunities of RL, in particular, products and services, games, bandits, recommender systems, robotics, transportation, finance and economics, healthcare, education, combinatorial optimization  ...  These are developed in the settings of multi-armed bandits, but are applicable to RL problems. We discuss bandits in Section 3.3.  ...  Contextual bandits are a "mature" technique that can be widely applied. RL is "mature" for many single-, two-, and multi-player games.  ... 
arXiv:2202.11296v2 fatcat:xdtsmme22rfpfn6rgfotcspnhy

Optimizing peer referrals for public awareness using contextual bandits

Ramaravind Kommiya Mothilal, Amulya Yadav, Amit Sharma
2019 Proceedings of the Conference on Computing & Sustainable Societies - COMPASS '19  
Given the lack of initial information about the social network or how people respond to referral incentives, we use an explore-exploit strategy and present a contextual bandit agent, CoBBI, that optimizes  ...  With a fixed budget for referral incentives, a natural goal for such referral programs is to maximize the number of people reached.  ...  This strategy, known as an ϵ-greedy multi-armed bandit, works well for a variety of decision optimization problems [16].  ...
doi:10.1145/3314344.3332497 dblp:conf/dev/MothilalYS19 fatcat:e3n2fjkttveodp67ji3kj6ve3a
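
The ϵ-greedy strategy named in the snippet fits in a few lines: explore a random arm with probability ϵ, otherwise exploit the empirically best one. A generic sketch, not the CoBBI agent itself:

import random

def epsilon_greedy(means, counts, eps=0.1):
    """means[k]: running average reward of arm k; counts[k]: times pulled.
    Explore uniformly at random with probability eps (or if nothing has
    been tried yet); otherwise pick the empirically best arm."""
    if random.random() < eps or not any(counts):
        return random.randrange(len(means))
    return max(range(len(means)), key=lambda k: means[k])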

Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs [article]

Thomas Spooner, Nelson Vadori, Sumitra Ganesh
2021 arXiv   pre-print
Finally, we demonstrate the performance advantages of our algorithm on large-scale bandit and traffic intersection problems, providing a novel contribution to the latter in the form of a spatial approximation  ...  Factored policy gradients (FPGs), which follow, provide a common framework for analysing key state-of-the-art algorithms, are shown to generalise traditional policy gradients, and yield a principled way  ...  Acknowledgments and Disclosure of Funding The authors would like to acknowledge our colleagues Joshua Lockhart, Jason Long and Rui Silva for their input and suggestions at various key stages of the research  ... 
arXiv:2102.10362v3 fatcat:l7vtjc7kanfk7dgxnp2c3jc4oq

Efficient Counterfactual Learning from Bandit Feedback

Yusuke Narita, Shota Yasui, Kohei Yata
2019 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
For log data generated by contextual bandit algorithms, we consider offline estimators for the expected reward from a counterfactual policy.  ...  What is the most statistically efficient way to do off-policy optimization with batch data from bandit feedback?  ...  We are grateful to seminar participants at ICML/IJCAI/AAMAS Workshop "Machine Learning for Causal Inference, Counterfactual Prediction, and Autonomous Action (CausalML)" and RIKEN Center for Advanced Intelligence  ... 
doi:10.1609/aaai.v33i01.33014634 fatcat:lyqkmh3t45ailcpmkyeczlfaiu
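
A standard baseline for the off-policy estimation problem in this snippet is the inverse-propensity-score (IPS) estimator, which reweights logged rewards by how much more (or less) the counterfactual policy would have favored each logged action. The paper studies estimators that are more statistically efficient than this; the sketch below is only the textbook baseline.

import numpy as np

def ips_estimate(logs, target_policy):
    """Estimate the expected reward of target_policy from bandit log data.
    logs: list of (context, action, reward, logging_prob) tuples, where
    logging_prob is the probability the logging policy gave the action it
    took; target_policy(context, action) returns the target probability."""
    vals = [target_policy(x, a) / p * r for (x, a, r, p) in logs]
    return float(np.mean(vals))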
Showing results 1 — 15 out of 536 results