
Solving Multi-Arm Bandit Using a Few Bits of Communication [article]

Osama A. Hanna, Lin F. Yang, Christina Fragouli
2021 arXiv   pre-print
The multi-armed bandit (MAB) problem is an active learning framework that aims to select the best among a set of actions by sequentially observing rewards.  ...  only a few (as low as 3) bits to be sent per iteration while preserving the same regret bound.  ...  to a small constant factor, while using a few bits of communication.  ... 
arXiv:2111.06067v1 fatcat:4634ecqi65e2xobf3or3j676ay

User pairing using laser chaos decision maker for NOMA systems

Zengchao Duan, Aohan Li, Norihiro Okada, Yusuke Ito, Nicolas Chauvet, Makoto Naruse, Mikio Hasegawa
2022 Nonlinear Theory and Its Applications IEICE  
In the meantime, ultrafast methods of solving multi-armed bandit problems have been developed using chaotic laser time series.  ...  In this paper, we consider the user pairing problem in Non-Orthogonal Multiple Access as a multi-armed bandit problem and propose an ultra-fast user pairing algorithm based on the laser chaos decision  ...  Introduction The ultrafast decision-maker has been demonstrated to resolve multi-armed-bandit (MAB) problems at the GHz order using chaotic laser time series [1, 2].  ...
doi:10.1587/nolta.13.72 fatcat:maqdxrr2drczjk4iymfhuoliqy

Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication [article]

Yuanhao Wang, Jiachen Hu, Xiaoyu Chen, Liwei Wang
2019 arXiv   pre-print
For distributed multi-armed bandits, we propose a protocol with near-optimal regret and only O(Mlog(MK)) communication cost, where K is the number of arms.  ...  The communication cost is independent of the time horizon T, has only logarithmic dependence on the number of arms, and matches the lower bound except for a logarithmic factor.  ...  A UCB-based Protocol for Multi-armed Bandits In classic multi-armed bandit problems, upper confidence bound (UCB) algorithms are very efficient in solving regret minimization problem.  ... 
arXiv:1904.06309v2 fatcat:mukrdozn5bddnpkd5a3lkrbq5y

Network of Bandits insure Privacy of end-users [article]

Raphaël Féraud
2017 arXiv   pre-print
We provide a first algorithm, Distributed Median Elimination, which is optimal in term of number of transmitted bits and near optimal in term of speed-up factor with respect to an optimal algorithm run  ...  In order to distribute the best arm identification task as close as possible to the user's devices, on the edge of the Radio Access Network, we propose a new problem setting, where distributed players  ...  Recent years have seen an increasing interest for the study of the collaborative distribution scheme: N players collaborate to solve a multi-armed bandit problem.  ... 
arXiv:1602.03779v14 fatcat:4b57prgxnfgmlgiegoytikcz3u

Meta-learning of Exploration/Exploitation Strategies: The Multi-armed Bandit Case [chapter]

Francis Maes, Louis Wehenkel, Damien Ernst
2013 Communications in Computer and Information Science  
The exploration/exploitation (E/E) dilemma arises naturally in many subfields of Science. Multi-armed bandit problems formalize this dilemma in its canonical form.  ...  a large hypothesis space of candidate E/E strategies; and (iii), solve an optimization problem to find a candidate E/E strategy of maximal average performance over a sample of problems drawn from the  ...  Having a linear dependency between n_p and d is a classical choice when using EDAs [14]. Note that, in most cases the optimization is solved in a few or a few tens iterations.  ...
doi:10.1007/978-3-642-36907-0_7 fatcat:helibxckl5da3gw7m43k5mlck4

Faster Activity and Data Detection in Massive Random Access: A Multi-armed Bandit Approach [article]

Jialin Dong, Jun Zhang, Yuanming Shi, Jessie Hui Wang
2020 arXiv   pre-print
To further improve the convergence rate, an inner multi-armed bandit problem is established to learn the exploration policy of Bernoulli sampling.  ...  In this paper, we develop multi-armed bandit approaches for more efficient detection via coordinate descent, which make a delicate trade-off between exploration and exploitation in coordinate selection  ...  Due to the sporadic communications, only a few devices are active out of all devices at a given time instant [22].  ...
arXiv:2001.10237v1 fatcat:3joodcctcbajhhb3nw2yuk3oou

A program for sequential allocation of three Bernoulli populations

Janis Hardwick, Robert Oehmke, Quentin F Stout
1999 Computational Statistics & Data Analysis  
As an illustration, the program is used to create an adaptive sampling procedure that is the optimal solution to a 3-arm bandit problem.  ...  Extensions enabling the program to solve a variety of related problems are discussed.  ...  Computational support was provided by the Center for Parallel Computing at the University of Michigan. We are grateful for the comments of three referees who reviewed this paper.  ... 
doi:10.1016/s0167-9473(99)00039-0 fatcat:x67x6276xzfutf2vdan4w53lne

Adaptive decision making using a chaotic semiconductor laser for multi-armed bandit problem with time-varying hit probabilities

Akihiro Oda, Takatomo Mihana, Kazutaka Kanno, Makoto Naruse, Atsushi Uchida
2022 Nonlinear Theory and Its Applications IEICE  
We numerically demonstrate the principle of adaptive decision making for solving multi-armed bandit problems in dynamically changing reward environments.  ...  We use the tug-of-war method by comparing a threshold and a chaotic temporal waveform generated from a semiconductor laser observed in an experiment.  ...  Acknowledgments We acknowledge the support of the Japan Society for the Promotion of Science (JP19H00868, JP20K15185, and JP20H00233), JST CREST (JPMJCR17N2), and the Telecommunications Advancement Foundation  ...
doi:10.1587/nolta.13.112 fatcat:hqqv3jiiyjbnbp2wq6vy3mbnru

Multi-Agent Multi-Armed Bandits with Limited Communication [article]

Mridul Agarwal, Vaneet Aggarwal, Kamyar Azizzadenesheli
2021 arXiv   pre-print
We consider the problem where N agents collaboratively interact with an instance of a stochastic K arm bandit problem for K ≫ N.  ...  The agents aim to simultaneously minimize the cumulative regret over all the agents for a total of T time steps, the number of communication rounds, and the number of bits in each communication round.  ...  INTRODUCTION We consider a setup where N agents connected over a network, interact with a multi armed bandit (MAB) environment (Lattimore and Szepesvári, 2020).  ...
arXiv:2102.08462v1 fatcat:nocqx3l7fnes5hrytmutbmxjqm

Study of Multi-Armed Bandits for Energy Conservation in Cognitive Radio Sensor Networks

Juan Zhang, Hong Jiang, Zhenhua Huang, Chunmei Chen, Hesong Jiang
2015 Sensors  
In order to achieve this goal, the paper introduces the concept of a bounded MAB to find the optimal packet size to transfer by formulating different packet sizes for different arms under the channel condition  ...  Technological advances have led to the emergence of wireless sensor nodes in wireless networks. Sensor nodes are usually battery powered and hence have strict energy constraints.  ...  Conflicts of Interest The authors declare no conflict of interest.  ... 
doi:10.3390/s150409360 pmid:25905702 pmcid:PMC4431283 fatcat:td7lkevqhratxgm5c6wbuoaatm

Stochastic Contextual Bandits with Known Reward Functions [article]

Pranav Sakulkar, Bhaskar Krishnamachari
2016 arXiv   pre-print
Many sequential decision-making problems in communication networks can be modeled as contextual bandit problems, which are natural extensions of the well-known multi-armed bandit problem.  ...  of arms.  ...  When the number of bits to be sent at each time are finite, this represents the case of discrete contextual bandits considered in this paper. 2) Energy Harvesting Communications: Consider a power-aware  ... 
arXiv:1605.00176v2 fatcat:hefngf7i6ffddoopp27wzrlegi

Regret Minimisation in Multi-Armed Bandits Using Bounded Arm Memory [article]

Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan
2019 arXiv   pre-print
Designing an efficient regret minimisation algorithm that uses a constant number of words has long been interesting to the community.  ...  In this paper, we propose a constant word (RAM model) algorithm for regret minimisation for both finite and infinite Stochastic Multi-Armed Bandit (MAB) instances.  ...  We find that providing a lower bound on the cumulative regret under the bounded arm memory constraint is an interesting question, and we leave that for future investigation.  ... 
arXiv:1901.08387v1 fatcat:7a6fflsa4fgl3iqdsbxsy4wzxq

Optimal Transmission Rate Control Policies in a Wireless Link Under Partial State Information

I. Koutsopoulos, L. Tassiulas
2010 IEEE Transactions on Automatic Control  
The policy admits a simple interpretation: increase rate when the number of successive ACKs exceeds a threshold, and decrease rate when the number of successive NACKs exceeds a threshold.  ...  We consider the problem of PHY layer transmission rate control for maximum throughput on a wireless link over a finite time horizon.  ...  , or a degenerate case of a multi-armed bandit problem.  ... 
doi:10.1109/tac.2009.2033839 fatcat:2gsxwt5nnff3phwwbh3xilo6ha

Remote Contextual Bandits [article]

Francesco Pase, Deniz Gunduz, Michele Zorzi
2022 arXiv   pre-print
We consider a remote contextual multi-armed bandit (CMAB) problem, in which the decision-maker observes the context and the reward, but must communicate the actions to be taken by the agents over a rate-limited  ...  In this remote CMAB (R-CMAB) problem, the constraint on the communication rate between the decision-maker and the agents imposes a trade-off between the number of bits sent per agent and the acquired average  ...  This formulation is different from the existing results in the literature involving multi-agent multi-armed bandit (MAB). In [16], each agent can pull an arm and communicate with others.  ...
arXiv:2202.05182v1 fatcat:efdjgux6xfbxbmvoronut2scie

Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences [article]

Aadirupa Saha, Pierre Gaillard
2022 arXiv   pre-print
In summary, we believe our reduction idea will find a broader scope in solving a diverse class of dueling bandits setting, which are otherwise studied separately from multi-armed bandits with often more  ...  We first propose a novel reduction from any (general) dueling bandits to multi-armed bandits and despite the simplicity, it allows us to improve many existing results in dueling bandits.  ...  Acknowledgment Thanks to Julian Zimmert and Karan Singh for the useful discussions on the existing best-of-both-world multiarmed bandits results.  ... 
arXiv:2202.06694v1 fatcat:hd2j4clntzafzhcdjsndsek3gq