
Multi-armed recommendation bandits for selecting state machine policies for robotic systems

Pyry Matikainen, P. Michael Furlong, Rahul Sukthankar, Martial Hebert
2013 2013 IEEE International Conference on Robotics and Automation  
By borrowing concepts from collaborative filtering (as in recommender systems such as Netflix's), we present a multi-armed bandit formulation that incorporates recommendation techniques to efficiently  ...  We show that this formulation outperforms the individual approaches (recommendation, multi-armed bandits) as well as the baseline of selecting the 'average best' state machine across all rooms.  ...  Government is authorized to reproduce and distribute reprints for government purposes notwithstanding any copyright notation herein. We thank Shumeet Baluja for his helpful comments on the paper.  ... 
doi:10.1109/icra.2013.6631223 dblp:conf/icra/MatikainenFSH13 fatcat:q7yszwrh3rbe3f3igvl3ksfzsq

A multi-armed bandit approach for exploring partially observed networks

Kaushalya Madhawa, Tsuyoshi Murata
2019 Applied Network Science  
We formulate this problem as an exploration-exploitation problem and propose a novel nonparametric multi-armed bandit (MAB) algorithm for identifying which nodes to query.  ...  Conclusions: Our results demonstrate that multi-armed bandit-based algorithms are better suited than heuristic-based algorithms for exploring partially observed networks.  ...  Funding This work was supported by JSPS Grant-in-Aid for Scientific Research (B) (Grant Number 17H01785) and JST CREST (Grant Number JPMJCR1687).  ... 
doi:10.1007/s41109-019-0145-0 fatcat:u6hhlj3d7vbq7jx4247jjripqe

An embedded bandit algorithm based on agent evolution for cold-start problem

Rui Qiu, Wen Ji
2021 International Journal of Crowd Science  
Findings The authors introduce the Word2Vec technique for constructing user contextual features, which improves recommendation accuracy compared to the traditional multi-armed bandit approach.  ...  Purpose Many recommender systems are unable to provide accurate recommendations to users with limited interaction history, a difficulty known as the cold-start problem.  ...  Classic multi-armed bandit algorithm: The basic framework of the classic MAB algorithm can be formulated as follows.  ... 
doi:10.1108/ijcs-03-2021-0005 fatcat:ttifdzxunfbcnplmwpmmkfijoe
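The snippet above gestures at the "basic framework of the classic MAB algorithm" without room to state it; as a generic, hedged illustration (not the paper's embedded agent-evolution method), a minimal ε-greedy bandit loop in Python, with hypothetical Bernoulli arms:

```python
import random

def epsilon_greedy(reward_fns, n_rounds=10000, epsilon=0.1, seed=0):
    """Classic MAB loop: each round, pull one of K arms and update
    the pulled arm's empirical mean reward incrementally."""
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    total = 0.0
    for _ in range(n_rounds):
        if rng.random() < epsilon:   # explore: random arm
            arm = rng.randrange(k)
        else:                        # exploit: current best estimate
            arm = max(range(k), key=lambda a: means[a])
        r = reward_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running mean
        total += r
    return means, total

# Hypothetical Bernoulli arms with success probabilities 0.2, 0.5, 0.8
arms = [lambda rng, p=p: 1.0 if rng.random() < p else 0.0
        for p in (0.2, 0.5, 0.8)]
means, total = epsilon_greedy(arms)
```

After enough rounds the empirical means order the arms correctly and most pulls go to the best arm; the cold-start difficulty the paper targets is exactly the regime before these estimates have converged.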

Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization [article]

Matthew W. Hoffman, Bobak Shahriari, Nando de Freitas
2013 arXiv   pre-print
This problem is also known as fixed-budget best arm identification in the multi-armed bandit literature.  ...  We introduce a Bayesian approach for this problem and show that it empirically outperforms both the existing frequentist counterpart and other Bayesian optimization methods.  ...  The exploration phase consists of T rounds wherein a decision maker interacts with the bandit process by sampling arms.  ... 
arXiv:1303.6746v4 fatcat:ovreptonz5amfbvz7jzw2iwhae
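The snippet describes fixed-budget best-arm identification; a minimal sequential-halving sketch of that setting (an illustration under assumed Gaussian arms, not Hoffman et al.'s Bayesian approach):

```python
import math
import random

def sequential_halving(pull, k, budget, seed=0):
    """Fixed-budget best-arm identification: repeatedly drop the worse
    half of the candidate arms, spending an equal slice of the budget
    per surviving arm in each round."""
    rng = random.Random(seed)
    arms = list(range(k))
    rounds = max(1, math.ceil(math.log2(k)))
    per_round = budget // rounds
    for _ in range(rounds):
        pulls = max(1, per_round // len(arms))
        means = {a: sum(pull(a, rng) for _ in range(pulls)) / pulls
                 for a in arms}
        arms.sort(key=lambda a: means[a], reverse=True)
        arms = arms[: max(1, len(arms) // 2)]  # keep the better half
    return arms[0]

# Hypothetical Gaussian arms; arm 3 has the highest mean
mus = [0.1, 0.3, 0.5, 0.9]
best = sequential_halving(lambda a, rng: rng.gauss(mus[a], 0.2),
                          k=4, budget=4000)
```

The design point the abstract alludes to is that with a fixed total budget T, the question is purely how to allocate pulls to identify the best arm, not how to minimize regret along the way.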

Generic Outlier Detection in Multi-Armed Bandit [article]

Yikun Ban, Jingrui He
2020 arXiv   pre-print
For this problem, a learner aims to identify the arms whose expected rewards deviate significantly from most of the other arms.  ...  In this paper, we study the problem of outlier arm detection in multi-armed bandit settings, which finds plenty of applications in many high-impact domains such as finance, healthcare, and online advertising  ...  For example, a score of an outlier is measured by its distance to its k-th nearest neighbor, called k-th Nearest Neighbor [11] .  ... 
arXiv:2007.07293v1 fatcat:bl5yvngzqva5zju5ghsktnoesi

Contextual Multi-Armed Bandits for Causal Marketing [article]

Neela Sawant, Chitti Babu Namballa, Narayanan Sadagopan, Houssam Nassif
2018 arXiv   pre-print
This work explores the idea of a causal contextual multi-armed bandit approach to automated marketing, where we estimate and optimize the causal (incremental) effects.  ...  Our approach draws on strengths of causal inference, uplift modeling, and multi-armed bandits.  ...  More formally, let Φ(X) be a context matching algorithm that finds similar customers to X, like K-Nearest Neighbor, locality sensitive hashing, or propensity matching.  ... 
arXiv:1810.01859v1 fatcat:ixqhpj2wkzextkxq2d4oj7jdgy

The Sample Complexity of Online One-Class Collaborative Filtering [article]

Reinhard Heckel, Kannan Ramchandran
2017 arXiv   pre-print
Both questions arise in the design of recommender systems. We introduce a simple probabilistic user model, and analyze the performance of an online user-based CF algorithm.  ...  We prove that after an initial cold start phase, where recommendations are invested in exploring the user's preferences, this algorithm makes---up to a fraction of the recommendations required for updating  ...  Specifically, in this variant of the multi-armed bandit problem, the arms are grouped into clusters, and the arms within each cluster are dependent.  ... 
arXiv:1706.00061v1 fatcat:eqm4qshyjnayrmcfj52wofhbjy

Ultra Fast Medoid Identification via Correlated Sequential Halving [article]

Tavor Z. Baharav, David N. Tse
2019 arXiv   pre-print
multi-armed bandit algorithm.  ...  The resulting randomized algorithm is obtained by a direct conversion of the computation problem to a multi-armed bandit statistical inference problem.  ...  Acknowledgements The authors gratefully acknowledge funding from the NSF GRFP, Alcatel-Lucent Stanford Graduate Fellowship, NSF grant under CCF-1563098, and the Center for Science of Information (CSoI)  ... 
arXiv:1906.04356v2 fatcat:3vq7mc3yovbq7m4wufd7ckyv6m

Machine Learning Paradigms for Next-Generation Wireless Networks

Chunxiao Jiang, Haijun Zhang, Yong Ren, Zhu Han, Kwang-Cheng Chen, Lajos Hanzo
2017 IEEE wireless communications  
Next-generation wireless networks are expected to support extremely high data rates and radically new applications, which require a new wireless radio technology paradigm.  ...  Multi-Armed Bandits: Device-to-Device Networks. Models: In practice, multi-armed bandits (MAB) have been used to model resource allocation problems operating under a fixed budget by carefully  ...  Femto and small cells [14, 15] and device-to-device networks [16]: multi-armed bandit (exploration vs. exploitation; multi-armed bandit game); association under the unknown energy status of the base stations  ... 
doi:10.1109/mwc.2016.1500356wc fatcat:qyg65wlf5verhefwnzfdzmhopu

From Ads to Interventions: Contextual Bandits in Mobile Health [chapter]

Ambuj Tewari, Susan A. Murphy
2017 Mobile Health  
We have now come full circle because contextual bandits provide a natural framework for sequential decision making in mobile health.  ...  We will survey the contextual bandits literature with a focus on modifications needed to adapt existing approaches to the mobile health setting.  ...  There is some work on risk-aversion in multi-armed bandit problems [58, 59] .  ... 
doi:10.1007/978-3-319-51394-2_25 fatcat:zjto7w26v5fpvcbhpvlo5v7kha

Efficient Retrieval of Matrix Factorization-Based Top-k Recommendations: A Survey of Recent Approaches

Dung D. Le, Hady Lauw
2021 The Journal of Artificial Intelligence Research  
However, for the recommendation retrieval phase, naively scanning a large number of items to identify the few most relevant ones may inhibit truly real-time applications.  ...  Top-k recommendation seeks to deliver a personalized list of k items to each individual user.  ...  NSW (Malkov & Yashunin, 2019) is proposed to take advantage of the Delaunay Graph, the NSWN, and the Relative Neighborhood Graphs, enabling multi-scale hopping on different layers of the graph.  ... 
doi:10.1613/jair.1.12403 fatcat:fpum5xffmbhclme3hdmmbs34uy

Multi-Objective Model Selection via Racing

Tiantian Zhang, Michael Georgiopoulos, Georgios C. Anagnostopoulos
2016 IEEE Transactions on Cybernetics  
Recent attempts along this line include metaheuristics optimization, local search-based approaches, sequential model-based methods, portfolio algorithm approaches, and multi-armed bandits.  ...  There are many practical applications of MS, such as model parameter tuning, personalized recommendations, A/B testing, etc.  ...  Multi-Armed Bandit: Multi-Armed Bandit (MAB) problems refer to a wide variety of sequential allocation problems with an exploration-exploitation tradeoff.  ... 
doi:10.1109/tcyb.2015.2456187 pmid:26277013 fatcat:mvjz67vfk5gzdmz3jclakxpxhi

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks [article]

Jingjing Wang and Chunxiao Jiang and Haijun Zhang and Yong Ren and Kwang-Cheng Chen and Lajos Hanzo
2020 arXiv   pre-print
Future wireless networks have substantial potential in terms of supporting a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate  ...  Machine learning (ML) algorithms have achieved great success in supporting big data analytics, efficient parameter estimation and interactive decision making.  ...  Multi-Armed Bandit and Its Applications. 1) Methods: The multi-armed bandit technique, also called the K-armed bandit, models a decision-making problem in which an agent is faced with a dilemma of K different  ... 
arXiv:1902.01946v2 fatcat:7bveg6rmjfga5mftdkr3mst2qa
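The methods snippet above defines the K-armed bandit and its exploration dilemma; one standard resolution is UCB1, sketched here generically (not this survey's wireless-specific formulation, and with hypothetical Bernoulli arms):

```python
import math
import random

def ucb1(pull, k, horizon, seed=0):
    """UCB1: play each arm once, then always pull the arm maximizing
    empirical mean + sqrt(2 ln t / n_a), an optimism bonus that
    shrinks as an arm accumulates pulls."""
    rng = random.Random(seed)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: one pull per arm
        else:
            arm = max(range(k),
                      key=lambda a: means[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull(arm, rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return counts, means

# Hypothetical two-armed Bernoulli bandit, success probs 0.3 and 0.7
probs = [0.3, 0.7]
counts, means = ucb1(lambda a, rng: 1.0 if rng.random() < probs[a] else 0.0,
                     k=2, horizon=5000)
```

The confidence bonus guarantees every arm is revisited occasionally, so the agent never locks onto a bad arm early, yet pulls of suboptimal arms grow only logarithmically in the horizon.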

A Text-based Deep Reinforcement Learning Framework for Interactive Recommendation [article]

Chaoyang Wang and Zhiqiang Guo and Jianjun Li and Peng Pan and Guohui Li
2020 arXiv   pre-print
Due to its nature of learning from dynamic interactions and planning for long-run performance, reinforcement learning (RL) recently has received much attention in interactive recommender systems (IRSs)  ...  To address these two problems, in this paper, we propose a Text-based Deep Deterministic Policy Gradient framework (TDDPG-Rec) for IRSs.  ...  ACKNOWLEDGEMENTS We would like to thank the referees for their valuable comments, which helped improve this paper considerably. The work was partially supported by the National Natural Science  ... 
arXiv:2004.06651v4 fatcat:57f2nh7aj5hqdpcxcyncvhlgee

Online Learning: A Comprehensive Survey [article]

Steven C.H. Hoi, Doyen Sahoo, Jing Lu, Peilin Zhao
2018 arXiv   pre-print
from a sequence of data instances one at a time.  ...  This survey aims to provide a comprehensive overview of the online machine learning literature through a systematic review of basic ideas and key principles and a proper categorization of different algorithms  ...  Bandit learning tasks: Bandit online learning algorithms, also known as multi-armed bandits (MAB), have been used extensively in online recommender systems, such as online advertising for internet  ... 
arXiv:1802.02871v2 fatcat:mqorsb4gknhfhjfb4jcsvbrtwm