1,869 Hits in 5.7 sec

A General Framework for Bandit Problems Beyond Cumulative Objectives [article]

Asaf Cassel
<span title="2021-10-26">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The stochastic multi-armed bandit (MAB) problem is a common model for sequential decision problems.  ...  We provide a systematic approach to such problems, and derive general conditions under which the oracle policy is sufficiently tractable to facilitate the design of optimism-based (upper confidence bound  ...  A preliminary version of this work appeared at the Conference on Learning Theory, 2018.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1806.01380v3">arXiv:1806.01380v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/sdvne73f25cdfn64ayozw6g7k4">fatcat:sdvne73f25cdfn64ayozw6g7k4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211102135314/https://arxiv.org/pdf/1806.01380v3.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/58/c1/58c1e083d4058f5cbba7ef4ba0d34b20a32b76fa.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1806.01380v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Risk-Averse Explore-Then-Commit Algorithms for Finite-Time Bandits [article]

Ali Yekkehkhany, Ebrahim Arian, Mohammad Hajiesmaili, Rakesh Nagi
<span title="2019-09-11">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper, we study multi-armed bandit problems in the explore-then-commit setting.  ...  Compared to existing risk-averse bandit algorithms, our algorithms do not rely on hyper-parameters, resulting in more robust behavior in practice, which is verified by the numerical evaluation.  ...  There are several criteria to measure and model risk in a risk-averse multi-armed bandit problem. One of the common risk measures is the mean-variance paradigm [27].  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1904.13387v3">arXiv:1904.13387v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/d3o4h4fg5rfo3ff5lfff6qni5q">fatcat:d3o4h4fg5rfo3ff5lfff6qni5q</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200914021004/https://arxiv.org/pdf/1904.13387v3.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a0/0c/a00c73c7496222c657eb5338d9f63534a3005c14.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1904.13387v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Adaptive Portfolio by Solving Multi-armed Bandit via Thompson Sampling [article]

Mengying Zhu, Xiaolin Zheng, Yan Wang, Yuyuan Li, Qianqiao Liang
<span title="2019-11-14">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Under these circumstances, we can use multiple classic strategies as multiple strategic arms in a multi-armed bandit to naturally establish a connection with the portfolio selection problem.  ...  In this paper, we present a portfolio bandit strategy through Thompson sampling which aims to make online portfolio choices by effectively exploiting the performances among multiple arms.  ...  Multi-armed Bandit and Thompson Sampling This section covers the theory of and solutions to the multi-armed bandit problem, as well as Thompson sampling.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1911.05309v2">arXiv:1911.05309v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ltcndm2ntvd4ndmqu3rf6samda">fatcat:ltcndm2ntvd4ndmqu3rf6samda</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200831164354/https://arxiv.org/pdf/1911.05309v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/96/64/966472685b446bf320717def8e4e3d8c67b51d14.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1911.05309v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

A Contextual-bandit-based Approach for Informed Decision-making in Clinical Trials [article]

Yogatheesan Varatharajah, Brent Berry, Sanmi Koyejo, Ravishankar Iyer
<span title="2018-09-01">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The contextual-bandit and multi-arm bandit approaches provide 72.63 gains, respectively, compared to a random assignment.  ...  Recent efforts using multi-arm bandits, a type of reinforcement-learning method, have focused on maximizing clinical outcomes for a population that was assumed to be homogeneous.  ...  Although the general approach of multi-arm bandits perfectly suits our goal, a limitation of the prior studies utilizing this approach is that they do not account for the inter-patient variability, e.g  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1809.00258v1">arXiv:1809.00258v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qcaeqsatqfchrico55s3jg7hri">fatcat:qcaeqsatqfchrico55s3jg7hri</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20191020132812/https://arxiv.org/pdf/1809.00258v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/c1/6c/c16c1355cab50f5560082440693ce112848b60e1.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1809.00258v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback [article]

Alexandre Letard, Tassadit Amghar, Olivier Camp, Nicolas Gutowski
<span title="2020-09-16">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Recent works on Multi-Armed Bandits (MAB) and Combinatorial Multi-Armed Bandits (COM-MAB) show good results on a global accuracy metric.  ...  Herein, we propose a novel approach reducing the number of explicit feedbacks required by Combinatorial Multi-Armed Bandit (COM-MAB) algorithms while providing similar levels of global accuracy and learning  ...  Combinatorial Multi-Armed Bandit The Combinatorial Multi-Armed Bandit problem can be seen as a generalization of the MAB and CMAB problems in which, at each iteration, a list of k arms, named "Super-Arm  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2009.07518v1">arXiv:2009.07518v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hknfsllrczeylhsocryewgjsgi">fatcat:hknfsllrczeylhsocryewgjsgi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200922090031/https://arxiv.org/pdf/2009.07518v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2009.07518v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Designing multi-objective multi-armed bandits algorithms: A study

Madalina M. Drugan, Ann Nowe
<span title="">2013</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/qm5nunzmyva4tfjekdcm34uvhq" style="color: black;">The 2013 International Joint Conference on Neural Networks (IJCNN)</a> </i> &nbsp;
We propose an algorithmic framework for multi-objective multi-armed bandits with multiple rewards.  ...  The standard UCB1 is extended to a scalarized multi-objective UCB1, and we propose a Pareto UCB1 algorithm. Both algorithms are proven to have a logarithmic upper bound for their expected regret.  ...  The multi-armed bandit is a machine learning paradigm used to study and analyse resource allocation in stochastic and noisy environments.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ijcnn.2013.6707036">doi:10.1109/ijcnn.2013.6707036</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ijcnn/DruganN13.html">dblp:conf/ijcnn/DruganN13</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xjgu3eaigbduxls2x5redh3p5m">fatcat:xjgu3eaigbduxls2x5redh3p5m</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170808115015/http://ai.vub.ac.be/sites/default/files/MO_MAB_IJCNN_Accepted_v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/da/12/da1286dad2a277ced29283a738f541f4b8e83050.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ijcnn.2013.6707036"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Risk-Averse Action Selection Using Extreme Value Theory Estimates of the CVaR [article]

Dylan Troop, Frédéric Godin, Jia Yuan Yu
<span title="2020-12-10">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We finally show how the estimation procedure can be used in reinforcement learning by applying our method to the multi-arm bandit problem where the goal is to avoid catastrophic risk.  ...  Under appropriate conditions, we estimate the tail risk using a generalized Pareto distribution.  ...  We would like to thank Debbie J. Dupuis for her extremely valuable feedback.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1912.01718v2">arXiv:1912.01718v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fvrr2gtqgfalnfxjrwzxcoqo3u">fatcat:fvrr2gtqgfalnfxjrwzxcoqo3u</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201212015809/https://arxiv.org/pdf/1912.01718v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a0/c4/a0c4f5815770db9b712d5855a1ddf5fcad5dd5c0.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1912.01718v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Survey on Fair Reinforcement Learning: Theory and Practice [article]

Pratik Gajane, Akrati Saxena, Maryam Tavakol, George Fletcher, Mykola Pechenizkiy
<span title="2022-05-20">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this article, we provide an extensive overview of fairness approaches that have been implemented via a reinforcement learning (RL) framework.  ...  Fairness-aware learning aims at satisfying various fairness constraints in addition to the usual performance criteria via data-driven machine learning techniques.  ...  A partially observable MDP (POMDP) is a generalization of an MDP to model planning under uncertainty.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.10032v1">arXiv:2205.10032v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rrc7a5aumnbe3dmptkeh5ohapa">fatcat:rrc7a5aumnbe3dmptkeh5ohapa</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220622093508/https://arxiv.org/pdf/2205.10032v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/f6/52/f652dd6add59e00f92245485011c4e128626a67e.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.10032v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Active learning for classification: An optimistic approach

Timothe Collet, Olivier Pietquin
<span title="">2014</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/jzxnjbkbyjgehoue7yr6hvesdu" style="color: black;">2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)</a> </i> &nbsp;
Experiments on a generic classification problem demonstrate that these new algorithms compare positively to state-of-the-art methods.  ...  Based on previous work on bandit theory applied to active learning for regression, we introduce four novel algorithms for solving the online allocation of the budget in a classification problem.  ...  Toward this, they model the problem under a multi-armed bandit setting, in which pulling an arm corresponds to taking a sample in one of the distributions.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/adprl.2014.7010610">doi:10.1109/adprl.2014.7010610</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/adprl/ColletP14.html">dblp:conf/adprl/ColletP14</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ce7qevp5yvhe7mhejstbvx52wa">fatcat:ce7qevp5yvhe7mhejstbvx52wa</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170706044842/http://www.cristal.univ-lille.fr/%7Epietquin/pdf/ADPRL_2014_OPTC.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e9/14/e91467fe1da69944ee346f64246984a222f79b0d.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/adprl.2014.7010610"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit [article]

Giuseppe Burtini, Jason Loeppky, Ramon Lawrence
<span title="2015-11-03">2015</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We first explore the traditional stochastic model of a multi-armed bandit, then explore a taxonomic scheme of complications to that model, relating each complication to a specific requirement or  ...  We survey and synthesize the work of the online statistical learning paradigm referred to as multi-armed bandits, integrating the existing research as a resource for a certain class of online experiments  ...  In algorithms designed for the multi-armed bandit, exploring suboptimal arms according to our objective criteria is undesirable.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1510.00757v4">arXiv:1510.00757v4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/eyxqdq3yl5fpdbv53wtnkfa25a">fatcat:eyxqdq3yl5fpdbv53wtnkfa25a</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200829193347/https://arxiv.org/pdf/1510.00757v4.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/78/05/78055dd235b545cf5e4e23fa9b7dbedd4e10ab21.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1510.00757v4" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Risk-Constrained Thompson Sampling for CVaR Bandits [article]

Joel Q. L. Chang, Qiuyu Zhu, Vincent Y. F. Tan
<span title="2021-02-04">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The multi-armed bandit (MAB) problem is a ubiquitous decision-making problem that exemplifies the exploration-exploitation tradeoff. Standard formulations exclude risk in decision making.  ...  We explore the performance of a Thompson Sampling-based algorithm CVaR-TS under this risk measure.  ...  Cassel, A., Mannor, S., and Zeevi, A. A general approach to multi-armed bandits under risk criteria. In Proceedings of the 31st Conference On Learning Theory, pp. 1295-1306, 2018.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2011.08046v4">arXiv:2011.08046v4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hyjd5lrgwzdpxokermlhvrbfxy">fatcat:hyjd5lrgwzdpxokermlhvrbfxy</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210206005429/https://arxiv.org/pdf/2011.08046v4.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/8f/cc/8fccceafdbee7407b34357b31609dcb8bd2c4b73.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2011.08046v4" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Constrained regret minimization for multi-criterion multi-armed bandits [article]

Anmol Kagrecha, Jayakrishnan Nair, Krishna Jagannathan
<span title="2020-06-17">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We consider a stochastic multi-armed bandit setting and study the problem of regret minimization over a given time horizon, subject to a risk constraint.  ...  The proposed algorithm and analyses can be readily generalized to solve constrained multi-criterion optimization problems in the bandits setting.  ...  A general approach to multi-armed bandits under risk criteria. arXiv preprint arXiv:1806.01380, 2018. Yahel David and Nahum Shimkin. Pure exploration for max-quantile bandits.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2006.09649v1">arXiv:2006.09649v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cvgk5alyungfddpkozw5gggi2i">fatcat:cvgk5alyungfddpkozw5gggi2i</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200623094316/https://arxiv.org/pdf/2006.09649v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2006.09649v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Risk-Aware Algorithms for Combinatorial Semi-Bandits [article]

Shaarad Ayyagari, Ambedkar Dukkipati
<span title="2021-12-02">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper, we study the stochastic combinatorial multi-armed bandit problem under semi-bandit feedback.  ...  While much work has been done on algorithms that optimize the expected reward for linear as well as some general reward functions, we study a variant of the problem, where the objective is to be risk-aware  ...  Risk-awareness was first studied for multi-armed bandits under the mean-variance criterion (Sani et al., 2012) and the MIN and CVaR criteria (Galichet et al., 2013) for bounded rewards.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2112.01141v1">arXiv:2112.01141v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mzus6r6e4rbwtopboxys22z6pu">fatcat:mzus6r6e4rbwtopboxys22z6pu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211204035457/https://arxiv.org/pdf/2112.01141v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ee/cc/eecc7b123a2aa44d878e241094f042c991f95a41.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2112.01141v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Exploration vs Exploitation vs Safety: Risk-averse Multi-Armed Bandits [article]

Nicolas Galichet, Michèle Sebag, Olivier Teytaud (LRI, INRIA Saclay - Ile de France)
<span title="2014-01-06">2014</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
When the user-supplied risk level goes to 0, the arm quality tends toward the essential infimum of the arm distribution density, and MARAB tends toward the MIN multi-armed bandit algorithm, aimed at the  ...  Motivated by applications in energy management, this paper presents the Multi-Armed Risk-Aware Bandit (MARAB) algorithm.  ...  Acknowledgments We are grateful to J.-J. Christophe, J. Decock and the members of the Ilab Metis and Artelys, for fruitful collaboration. We thank the anonymous referees for their insightful comments.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1401.1123v1">arXiv:1401.1123v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zq7nsfw6pzcoffcdcrwk7zyaey">fatcat:zq7nsfw6pzcoffcdcrwk7zyaey</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200822031518/https://arxiv.org/pdf/1401.1123v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/f9/7e/f97efb7b792a9109380c0da4d3a1b4ee6d892e91.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1401.1123v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Building Bridges: Viewing Active Learning from the Multi-Armed Bandit Lens [article]

Ravi Ganti, Alexander G. Gray
<span title="2013-09-26">2013</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper we propose a multi-armed bandit inspired, pool based active learning algorithm for the problem of binary classification.  ...  By carefully constructing an analogy between active learning and multi-armed bandits, we utilize ideas such as lower confidence bounds, and self-concordant regularization from the multi-armed bandit literature  ...  The MAB problem is a B round game, where in a generic round t, the player has to pull one among k arms of a multi-armed bandit. On doing so the player suffers a loss L t .  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1309.6830v1">arXiv:1309.6830v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zn5m4oibtbazlp7xvdutcagdrm">fatcat:zn5m4oibtbazlp7xvdutcagdrm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200929140747/https://arxiv.org/ftp/arxiv/papers/1309/1309.6830.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/0c/a1/0ca19ac66ec7157a6e209baa3327a2f500a9ff0c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1309.6830v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>
Showing results 1 – 15 out of 1,869 results