2,601 Hits in 5.1 sec

A Novel Confidence-Based Algorithm for Structured Bandits [article]

Andrea Tirinzoni, Alessandro Lazaric, Marcello Restelli
2020 arXiv   pre-print
We introduce a novel phased algorithm that exploits the given structure to build confidence sets over the parameters of the true bandit problem and rapidly discard all sub-optimal arms.  ...  In particular, unlike standard bandit algorithms with no structure, we show that the number of times a suboptimal arm is selected may actually be reduced thanks to the information collected by pulling  ...  In this paper, we focus on widely applied confidence-based strategies for structured bandits. Our contributions are as follows. 1) We propose an algorithm that runs in phases.  ... 
arXiv:2005.11593v1 fatcat:3dqqmxsg2bfy5isgx3ifikkhxe

Differentiable Linear Bandit Algorithm [article]

Kaige Yang, Laura Toni
2020 arXiv   pre-print
Specifically, noting that existing UCB-typed algorithms are not differentiable with respect to the confidence bound, we first propose a novel differentiable linear bandit algorithm.  ...  Upper Confidence Bound (UCB) is arguably the most commonly used method for linear multi-armed bandit problems.  ...  Conclusion We propose SoftUCB, a novel UCB-typed linear bandit algorithm based on an adaptive confidence bound, resulting in a less conservative algorithm with respect to UCB-typed algorithms with constructed  ... 
arXiv:2006.03000v1 fatcat:lyg257yoenbphm55chqqxc32ai
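As background for the UCB-typed methods this entry discusses, here is a minimal sketch of the classic UCB1 index rule (illustrative only; `ucb1`, its parameters, and the toy Bernoulli arms are assumptions for this sketch, not the paper's SoftUCB algorithm):

```python
import math
import random

def ucb1(pull, n_arms, horizon, c=2.0):
    """Play each arm once, then repeatedly pick the arm maximizing
    empirical mean + sqrt(c * ln(t) / pulls) -- the upper confidence bound."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initial round-robin so every count is positive
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(c * math.log(t) / counts[a]))
        counts[arm] += 1
        sums[arm] += pull(arm)
    return counts

# Toy run: two Bernoulli arms with means 0.2 and 0.8.
random.seed(0)
means = [0.2, 0.8]
counts = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0,
              n_arms=2, horizon=1000)
```

With this confidence width the better arm quickly accumulates most of the pulls, which is the behaviour the differentiable variant above aims to make tunable.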

Improved Regret Bounds of Bilinear Bandits using Action Space Analysis

Kyoungseok Jang, Kwang-Sung Jun, Se-Young Yun, Wanmo Kang
2021 International Conference on Machine Learning  
We consider the bilinear bandit problem where the learner chooses a pair of arms, each from two different action spaces of dimension d_1 and d_2, respectively.  ...  Second, we additionally devise an algorithm with better empirical performance than previous algorithms.  ...  The design of rO-UCB is based on our novel adaptive design of confidence bounds for low-rank matrices that can be used beyond rank-one measurements, which may be of independent interest.  ... 
dblp:conf/icml/JangJYK21 fatcat:inhanrowvfazhemkk52tzk7wbq

Bayesian Unification of Gradient and Bandit-Based Learning for Accelerated Global Optimisation

Ole-Christoffer Granmo
2016 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)  
However, for continuous optimisation problems or problems with a large number of actions, bandit-based approaches can be hindered by slow learning.  ...  Bandit-based optimisation has a remarkable advantage over gradient-based approaches due to its global perspective, which eliminates the danger of getting stuck at local optima.  ...  In [13], the authors analysed several confidence-interval-based algorithms.  ... 
doi:10.1109/icmla.2016.0044 dblp:conf/icmla/Granmo16 fatcat:3ep5f5abnnho7awhdrgcchfjou

A multi-armed bandit approach for exploring partially observed networks

Kaushalya Madhawa, Tsuyoshi Murata
2019 Applied Network Science  
We formulate this problem as an exploration-exploitation problem and propose a novel nonparametric multi-armed bandit (MAB) algorithm for identifying which nodes to query.  ...  Conclusions: Our results demonstrate that multi-armed bandit-based algorithms are better suited to exploring partially observed networks than heuristic-based algorithms.  ...  Funding This work was supported by JSPS Grant-in-Aid for Scientific Research (B) (Grant Number 17H01785) and JST CREST (Grant Number JPMJCR1687).  ... 
doi:10.1007/s41109-019-0145-0 fatcat:u6hhlj3d7vbq7jx4247jjripqe

Exploring Partially Observed Networks with Nonparametric Bandits [article]

Kaushalya Madhawa, Tsuyoshi Murata
2018 arXiv   pre-print
We formulate this problem as an exploration-exploitation problem and propose a novel nonparametric multi-armed bandit (MAB) algorithm for identifying which nodes to query.  ...  Our contributions include: (1) iKNN-UCB, a novel nonparametric MAB algorithm that applies k-nearest-neighbor UCB to the setting where the arms are presented in a vector space, (2) a theoretical guarantee  ...  We proposed a novel nonparametric multi-armed bandit algorithm, iKNN-UCB, with sublinear regret.  ... 
arXiv:1804.07059v1 fatcat:te63sbc3w5atbkhykqvrzb5vni
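The kNN-UCB idea this entry mentions — score an arm in a vector space by the mean reward of its nearest previously pulled neighbors plus a distance-based uncertainty bonus — can be sketched roughly as follows (`knn_ucb_score` and its exact scoring rule are illustrative assumptions, not the paper's iKNN-UCB rule):

```python
import math

def knn_ucb_score(x, history, k=3, alpha=1.0):
    """Score a candidate arm x (a feature vector) by the mean reward of its
    k nearest previously-pulled arms, plus an uncertainty bonus that grows
    with their distance -- the generic kNN-UCB idea."""
    if not history:
        return float("inf")  # nothing observed yet: force exploration
    nearest = sorted((math.dist(x, xi), ri) for xi, ri in history)[:k]
    mean = sum(r for _, r in nearest) / len(nearest)
    bonus = alpha * sum(d for d, _ in nearest) / len(nearest)
    return mean + bonus

# Toy history of (arm vector, observed reward) pairs.
history = [((0.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
# A candidate near the rewarding neighbor scores higher than one near the poor one.
near_good = knn_ucb_score((0.1, 0.1), history, k=1)
near_bad = knn_ucb_score((0.9, 0.9), history, k=1)
```

In a node-querying loop one would score every candidate node's features this way and query the argmax each round.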

Bernoulli Rank-1 Bandits for Click Feedback

Sumeet Katariya, Branislav Kveton, Csaba Szepesvári, Claire Vernade, Zheng Wen
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
This is a special case of the stochastic rank-1 bandit problem considered in recent work, which proposed an elimination-based algorithm, Rank1Elim, and showed that Rank1Elim's regret scales linearly with  ...  With the help of a novel result concerning the scaling of KL divergences, we prove that with this change our algorithm will be competitive no matter the value of μ.  ...  However, a naive bandit algorithm that ignores the rank-1 structure and treats each row-column pair as unrelated arms has O(K^2 log n) regret. While a naive bandit algorithm is unable to exploit the  ... 
doi:10.24963/ijcai.2017/278 dblp:conf/ijcai/KatariyaKSVW17 fatcat:u3y3vrmb45h6hfl7j2hrdxfciu

Quadratic Sparse Gaussian Graphical Model Estimation Method for Massive Variables

Jiaqi Zhang, Meng Wang, Qinchi Li, Sen Wang, Xiaojun Chang, Beilun Wang
2020 Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence  
We consider the problem of estimating a sparse Gaussian Graphical Model with a special graph topological structure and more than a million variables.  ...  To overcome this challenge, we propose a novel method, called Fast and Scalable Inverse Covariance Estimator by Thresholding (FST).  ...  Acknowledgments This work was partially supported by NSFC (61976112), NSFC-NRF Joint Research Project (61861146001), and the Collaborative Innovation Center of Novel Software Technology and Industrialization  ... 
doi:10.24963/ijcai.2020/406 dblp:conf/ijcai/XueWWZ20 fatcat:p62k4vmw7fg7tcq6ifd3o47tza

Greedy Confidence Pursuit: A Pragmatic Approach to Multi-bandit Optimization [chapter]

Philip Bachman, Doina Precup
2013 Lecture Notes in Computer Science  
...  arms (with high confidence) for as many of the bandits as possible.  ...  To solve this problem, which we call greedy confidence pursuit, we develop a method based on posterior sampling.  ...  Note that MAP-UCB is a novel algorithm that we introduce to provide non-trivial competition for GCP-Bayes.  ... 
doi:10.1007/978-3-642-40988-2_16 fatcat:3gp4ikabnvgvrjdaetq4x3th64
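The posterior-sampling approach this entry builds on is closely related to Thompson sampling; here is a minimal Beta-Bernoulli sketch (generic illustration only — `thompson_bernoulli` is an assumed name, and this is not the GCP-Bayes method itself):

```python
import random

def thompson_bernoulli(pull, n_arms, horizon, seed=0):
    """Beta(1,1) priors; each round, sample a mean per arm from its Beta
    posterior and play the argmax -- basic posterior sampling for
    Bernoulli rewards."""
    rng = random.Random(seed)
    wins = [1] * n_arms    # Beta alpha (successes + 1)
    losses = [1] * n_arms  # Beta beta  (failures + 1)
    for _ in range(horizon):
        samples = [rng.betavariate(wins[a], losses[a]) for a in range(n_arms)]
        arm = samples.index(max(samples))
        if pull(arm):
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses

# Toy run: two Bernoulli arms with means 0.3 and 0.7.
reward_rng = random.Random(1)
means = [0.3, 0.7]
wins, losses = thompson_bernoulli(lambda a: reward_rng.random() < means[a],
                                  n_arms=2, horizon=500)
```

Posterior sampling naturally concentrates pulls on arms whose posteriors still overlap with the best, which is what makes it a good building block for confidence-pursuit objectives.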

Identifying Outlier Arms in Multi-Armed Bandit

Honglei Zhuang, Chi Wang, Yifan Wang
2017 Neural Information Processing Systems  
We study a novel problem lying at the intersection of two areas: multi-armed bandits and outlier detection.  ...  The multi-armed bandit is a useful tool to model the process of incrementally collecting data for multiple objects in a decision space.  ...  For both of our algorithms, we derived confidence intervals based on the Bernoulli distribution.  ... 
dblp:conf/nips/Zhuang0W17 fatcat:xi7txe4banfvbcek2ewaatd44a

Combinatorial Pure Exploration of Multi-Armed Bandits

Shouyuan Chen, Tian Lin, Irwin King, Michael R. Lyu, Wei Chen
2014 Neural Information Processing Systems  
We present general learning algorithms which work for all decision classes that admit offline maximization oracles in both fixed confidence and fixed budget settings.  ...  The CPE problem represents a rich class of pure exploration tasks which covers not only many existing models but also novel cases where the object of interest has a nontrivial combinatorial structure.  ...  In this paper, we propose two novel learning algorithms for general CPE problem: one for the fixed confidence setting and one for the fixed budget setting.  ... 
dblp:conf/nips/ChenLKLC14 fatcat:hfgiaso6qbdyflniwemtb5sxke

A Survey on Practical Applications of Multi-Armed and Contextual Bandits [article]

Djallel Bouneffouf, Irina Rish
2019 arXiv   pre-print
Specifically, we introduce a taxonomy of common MAB-based applications and summarize the state of the art for each of those domains.  ...  The multi-armed bandit field is currently flourishing, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical bandit  ...  For this, they develop a LinUCB-based bandit algorithm.  ... 
arXiv:1904.10040v1 fatcat:j6v37wy7f5bmvpfzzhtnutbeoa

Tuning Confidence Bound for Stochastic Bandits with Bandit Distance [article]

Xinyu Zhang, Srinjoy Das, Ken Kreutz-Delgado
2021 arXiv   pre-print
We propose a novel modification of the standard upper confidence bound (UCB) method for the stochastic multi-armed bandit (MAB) problem which tunes the confidence bound of a given bandit based on its distance  ...  Our UCB distance tuning (UCB-DT) formulation enables improved performance as measured by expected regret by preventing the MAB algorithm from focusing on non-optimal bandits, which is a well-known deficiency  ...  We therefore believe that our analysis tool provides a novel and more intuitive perspective on analyzing UCB-based methods.  ... 
arXiv:2110.02690v1 fatcat:tzst3wjevbcyxevkkbokoqd37y

Sequential Monte Carlo Bandits [article]

Michael Cherkassky, Luke Bornn
2013 arXiv   pre-print
In this paper we propose a flexible and efficient framework for handling multi-armed bandits, combining sequential Monte Carlo algorithms with hierarchical Bayesian modeling techniques.  ...  The framework naturally encompasses restless bandits, contextual bandits, and other bandit variants under a single inferential model.  ...  For each sample in D, we ask the bandit algorithm for a recommended arm to play.  ... 
arXiv:1310.1404v1 fatcat:qk7bxmajsfcxbggrww3hdkdfdu
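Combining sequential Monte Carlo with bandit posteriors can be illustrated by representing each arm's Bernoulli mean with weighted particles (a generic sketch under simplifying assumptions — no resampling step, no hierarchical or contextual model — not the paper's framework):

```python
import random

def smc_bandit(pull, n_arms, horizon, n_particles=200, seed=0):
    """SMC-flavoured Thompson sampling: each arm's Bernoulli mean is a
    cloud of weighted particles; weights are updated by the likelihood
    of each observed reward."""
    rng = random.Random(seed)
    particles = [[rng.random() for _ in range(n_particles)] for _ in range(n_arms)]
    weights = [[1.0 / n_particles] * n_particles for _ in range(n_arms)]
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Thompson step: draw one particle per arm from its weighted cloud,
        # then play the arm with the largest drawn mean.
        draws = [rng.choices(particles[a], weights[a])[0] for a in range(n_arms)]
        arm = draws.index(max(draws))
        r = pull(arm)
        pulls[arm] += 1
        # Weight update: Bernoulli likelihood of the observed reward.
        w = [wi * (p if r else 1 - p)
             for wi, p in zip(weights[arm], particles[arm])]
        total = sum(w)
        weights[arm] = [wi / total for wi in w]
    return pulls

# Toy run: two Bernoulli arms with means 0.2 and 0.8.
reward_rng = random.Random(2)
means = [0.2, 0.8]
pulls = smc_bandit(lambda a: reward_rng.random() < means[a],
                   n_arms=2, horizon=300)
```

A full SMC treatment would add resampling when the effective sample size drops and would let the particle model be hierarchical, which is what makes the framework flexible across bandit variants.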

Content-based image retrieval with hierarchical Gaussian Process bandits with self-organizing maps

Ksenia Konyushkova, Dorota Glowacka
2013 The European Symposium on Artificial Neural Networks  
A content-based image retrieval system based on relevance feedback is proposed.  ...  The approach, based on hierarchical Gaussian Process (GP) bandits, is used to trade off exploration and exploitation in presenting the images in each round.  ...  While this problem has been studied before (e.g. [1]), we propose a novel, computationally efficient approach based on 2-level hierarchical Gaussian Process bandits.  ... 
dblp:conf/esann/KonyushkovaG13 fatcat:ss5en37kujdc3kj7xv55rrumme
Showing results 1 — 15 out of 2,601 results