186 results

Adapting to Misspecification in Contextual Bandits [article]

Dylan J. Foster and Claudio Gentile and Mehryar Mohri and Julian Zimmert
<span title="2021-07-12">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We introduce a new family of oracle-efficient algorithms for ε-misspecified contextual bandits that adapt to unknown model misspecification – both for finite and infinite action settings.  ...  Specializing to linear contextual bandits with infinite actions in d dimensions, we obtain the first algorithm that achieves the optimal O(d√(T) + ε√(d)T) regret bound for unknown misspecification level  ...  Adapting to Misspecification: An Oracle-Efficient Algorithm We now present our main result: an efficient reduction from contextual bandits to online regression that adapts to unknown misspecification ε  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2107.05745v1">arXiv:2107.05745v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/aapvoy6xovh4nd5lizacrwr5ai">fatcat:aapvoy6xovh4nd5lizacrwr5ai</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210715211049/https://arxiv.org/pdf/2107.05745v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/7d/db/7ddb8c8deabc8800ac80b6a478c306976aaa2065.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2107.05745v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Adapting to Misspecification in Contextual Bandits with Offline Regression Oracles [article]

Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey
<span title="2021-06-11">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We propose a simple family of contextual bandit algorithms that adapt to misspecification error by reverting to a good safe policy when there is evidence that misspecification is causing a regret increase  ...  Our algorithm requires only an offline regression oracle to ensure regret guarantees that gracefully degrade in terms of a measure of the average misspecification level.  ...
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2102.13240v2">arXiv:2102.13240v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ho4raxl7xfcjnbw5n2crzlnqzy">fatcat:ho4raxl7xfcjnbw5n2crzlnqzy</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210622141216/https://arxiv.org/pdf/2102.13240v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/6c/4f/6c4f209e2a378d3bc0b33bdb81906a6e6725e2f8.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2102.13240v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Balanced Linear Contextual Bandits

Maria Dimakopoulou, Zhengyuan Zhou, Susan Athey, Guido Imbens
<span title="2019-07-17">2019</span> <i title="Association for the Advancement of Artificial Intelligence (AAAI)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wtjcymhabjantmdtuptkk62mlq" style="color: black;">PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE</a> </i> &nbsp;
We develop algorithms for contextual bandits with linear payoffs that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation  ...  Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models  ...  This research is generously supported by ONR grant N00014-17-1-2131, by the Sloan Foundation, by the "Arvanitidis in Memory of William K.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1609/aaai.v33i01.33013445">doi:10.1609/aaai.v33i01.33013445</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cvzn2dxls5akzgdweu57qy6co4">fatcat:cvzn2dxls5akzgdweu57qy6co4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200306093033/https://aaai.org/ojs/index.php/AAAI/article/download/4221/4099" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/d9/cf/d9cf55dbd5a3b2cb3f343ee292ba118049b5505b.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1609/aaai.v33i01.33013445"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

Balanced Linear Contextual Bandits [article]

Maria Dimakopoulou, Zhengyuan Zhou, Susan Athey, Guido Imbens
<span title="2018-12-15">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We develop algorithms for contextual bandits with linear payoffs that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation  ...  Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models  ...  This research is generously supported by ONR grant N00014-17-1-2131, by the Sloan Foundation, by the "Arvanitidis in Memory of William K.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1812.06227v1">arXiv:1812.06227v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fjvhmzl3kzb3zfpehdz65aipzu">fatcat:fjvhmzl3kzb3zfpehdz65aipzu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200929144954/https://arxiv.org/pdf/1812.06227v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/d3/68/d3687c855977095204baa969335ff4177613bbea.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1812.06227v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Bootstrap Thompson Sampling and Sequential Decision Problems in the Behavioral Sciences

Dean Eckles, Maurits Kaptein
<span title="">2019</span> <i title="SAGE Publications"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/xzpnzbjdv5chrjoc6ib7gpmw4i" style="color: black;">SAGE Open</a> </i> &nbsp;
We illustrate its robustness to model misspecification, which is a common concern in behavioral science applications.  ...  We show how BTS can be readily adapted to be robust to dependent data, such as repeated observations of the same units, which is common in behavioral science applications.  ...  Dean Eckles contributed to earlier versions of this article, while being an employee of Facebook, Inc.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1177/2158244019851675">doi:10.1177/2158244019851675</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xyqgfmxkm5hsfbnztj7v2nvi2y">fatcat:xyqgfmxkm5hsfbnztj7v2nvi2y</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200508110222/https://pure.uvt.nl/ws/portalfiles/portal/32386196/MTO_Kaptein_bootstrap_Thompson_sampling_Sage_Open_2019.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/07/b4/07b430047f4f1484f1542ead3fd8b66e79678b4c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1177/2158244019851675"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> sagepub.com </button> </a>

Misspecified Gaussian Process Bandit Optimization [article]

Ilija Bogunovic, Andreas Krause
<span title="2021-11-09">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
...  elimination-type algorithm that can adapt to unknown model misspecification.  ...  In addition, in a stochastic contextual setting, we show that EC-GP-UCB can be effectively combined with the regret bound balancing strategy and attain similar regret bounds despite not knowing ε.  ...  Several works have recently considered the misspecified contextual linear bandit problem with unknown model misspecification ε.  ...
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.05008v1">arXiv:2111.05008v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/2rvvzi3iejcq3funb5onzxwuy4">fatcat:2rvvzi3iejcq3funb5onzxwuy4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211113021142/https://arxiv.org/pdf/2111.05008v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/31/d3/31d3916cb5ce9f1acd23670741eebfda7458ae39.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.05008v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Universal and data-adaptive algorithms for model selection in linear contextual bandits [article]

Vidya Muthukumar, Akshay Krishnamurthy
<span title="2021-11-08">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Model selection in contextual bandits is an important complementary problem to regret minimization with respect to a fixed model class.  ...  Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.  ...  This work was done in part while the authors were visiting the Simons Institute for the Theory of Computing.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.04688v1">arXiv:2111.04688v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/vdqjdciaibh3larpp6l6sl5bra">fatcat:vdqjdciaibh3larpp6l6sl5bra</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211112013733/https://arxiv.org/pdf/2111.04688v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/11/bc/11bc17dfa534b43183d68901ad32b55154d3138b.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.04688v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Latent Bandits Revisited [article]

Joey Hong and Branislav Kveton and Manzil Zaheer and Yinlam Chow and Amr Ahmed and Craig Boutilier
<span title="2020-06-15">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Our methods are contextual and aware of model uncertainty and misspecification.  ...  A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.  ...  [15] propose a unified framework that adapts classic bandit algorithms, such as UCB and TS, to the multi-arm structured bandit setting.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2006.08714v1">arXiv:2006.08714v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/njob7vr4knha3prkz7pnzsrhrm">fatcat:njob7vr4knha3prkz7pnzsrhrm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200623112313/https://arxiv.org/pdf/2006.08714v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2006.08714v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit [article]

Giuseppe Burtini, Jason Loeppky, Ramon Lawrence
<span title="2015-11-03">2015</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Adaptive and sequential experiment design is a well-studied area in numerous domains.  ...  We first explore the traditional stochastic model of a multi-armed bandit, then explore a taxonomic scheme of complications to that model, for each complication relating it to a specific requirement or  ...  Adaptive dose finding designs can be seen as a special case of continuum-armed bandits and biomarker-adaptive designs can be seen as a special case of contextual bandits, where biomarkers are observed  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1510.00757v4">arXiv:1510.00757v4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/eyxqdq3yl5fpdbv53wtnkfa25a">fatcat:eyxqdq3yl5fpdbv53wtnkfa25a</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200829193347/https://arxiv.org/pdf/1510.00757v4.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/78/05/78055dd235b545cf5e4e23fa9b7dbedd4e10ab21.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1510.00757v4" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits [article]

Gergely Neu, Julia Olkhovskaya
<span title="2022-05-24">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Our second algorithm, RobustLinExp3, is shown to be robust to misspecification, in that it achieves a regret bound of O((Kd)^(1/3) T^(2/3)) + ε√d · T if the true reward function is linear up to an additive  ...  We consider an adversarial variant of the classic K-armed linear contextual bandit problem where the sequence of loss functions associated with each arm are allowed to change without restriction over time  ...  Acknowledgments We would like to thank Haipeng Luo, Chen-Yu Wei and Chung-Wei Lee for pointing out a technical issue with an earlier version of our proof of Lemma 6, and we thank Wojciech Kotłowski for  ...
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2002.00287v3">arXiv:2002.00287v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fx7iy4tierfrvd65kv7s4j5yma">fatcat:fx7iy4tierfrvd65kv7s4j5yma</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220526115259/https://arxiv.org/pdf/2002.00287v3.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/7b/e4/7be452949701d4713507e236c1c2a848668ef0fb.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2002.00287v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles [article]

Dylan J. Foster, Alexander Rakhlin
<span title="2020-06-23">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We provide the first universal and optimal reduction from contextual bandits to online regression.  ...  A fundamental challenge in contextual bandits is to develop flexible, general-purpose algorithms with computational requirements no worse than classical supervised learning tasks such as classification  ...  This proof uses arguments similar to a lower bound against strongly adaptive regret for (non-contextual) multi-armed bandits given in Daniely et al. (2015a).  ...
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2002.04926v2">arXiv:2002.04926v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/5iepwb62fjbmjjgd4y2bi6utku">fatcat:5iepwb62fjbmjjgd4y2bi6utku</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200630214710/https://arxiv.org/pdf/2002.04926v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ac/e4/ace4a37699d393f27cc01542c48b6b1388edd2bd.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2002.04926v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Thompson sampling with the online bootstrap [article]

Dean Eckles, Maurits Kaptein
<span title="2014-10-15">2014</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Thompson sampling provides a solution to bandit problems in which new observations are allocated to arms with the posterior probability that an arm is optimal.  ...  We first explain BTS and show that the performance of BTS is competitive to Thompson sampling in the well-studied Bernoulli bandit case.  ...  The Gaussian bandit illustrated robustness of BTS in cases of model misspecification.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1410.4009v1">arXiv:1410.4009v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/gj5ba3o2ozabjbp36l73v2nf3q">fatcat:gj5ba3o2ozabjbp36l73v2nf3q</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200907074022/https://arxiv.org/pdf/1410.4009v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ab/6d/ab6dfa60a5ed25ee24ba8a9531f085316ccbf661.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1410.4009v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Estimation Considerations in Contextual Bandits [article]

Maria Dimakopoulou, Zhengyuan Zhou, Susan Athey, Guido Imbens
<span title="2018-12-16">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We develop parametric and non-parametric contextual bandits that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation bias  ...  Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models  ...  Linvill" Stanford Graduate Fellowship in Science & Engineering and by the Onassis Foundation.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1711.07077v4">arXiv:1711.07077v4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/2eqiwj6ljjdsle7svb74nbpj7u">fatcat:2eqiwj6ljjdsle7svb74nbpj7u</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200913235440/https://arxiv.org/pdf/1711.07077v4.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a9/cc/a9cc2e1cc31f1b9839d2a847b75df7512ce543a4.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1711.07077v4" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Regret Bound Balancing and Elimination for Model Selection in Bandits and RL [article]

Aldo Pacchiano, Christoph Dann, Claudio Gentile, Peter Bartlett
<span title="2020-12-24">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
This factor is reasonably small in several applications, including linear bandits and MDPs with nested function classes, linear bandits with unknown misspecification, and LinUCB applied to linear bandits  ...  Finally, unlike recent efforts in model selection for linear stochastic bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment  ...  E.g., Foster et al. [2020] achieve optimal rates for selecting the misspecification level in the setting of contextual linear bandits.  ...
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2012.13045v1">arXiv:2012.13045v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qjjf57epszdkrazhupxyrbilqu">fatcat:qjjf57epszdkrazhupxyrbilqu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201228224529/https://arxiv.org/pdf/2012.13045v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/56/0f/560f6277f345d6a3aa0004b1bb2b8e7e8fe985df.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2012.13045v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination [article]

Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
<span title="2021-06-10">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We answer this question in the affirmative for both linear regression and contextual bandits. In fact our algorithms succeed where conventional methods fail.  ...  In this work we revisit two classic high-dimensional online learning problems, namely linear regression and contextual bandits, from the perspective of adversarial robustness.  ...  Acknowledgments We thank Ainesh Bakshi and Dylan Foster for useful discussions related to their papers, [BP20] and [FR20] , respectively.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2010.04157v3">arXiv:2010.04157v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/h2bskgaudvbqhjcer7m3t4wfne">fatcat:h2bskgaudvbqhjcer7m3t4wfne</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201018162342/https://arxiv.org/pdf/2010.04157v1.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2010.04157v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>
Showing results 1-15 of 186.