10 Hits in 2.9 sec

MOTS: Minimax Optimal Thompson Sampling [article]

Tianyuan Jin, Pan Xu, Jieming Shi, Xiaokui Xiao, Quanquan Gu
2020 arXiv   pre-print
To our knowledge, MOTS is the first Thompson sampling type algorithm that achieves the minimax optimality for multi-armed bandit problems.  ...  We prove that this simple variant of Thompson sampling achieves the minimax optimal regret bound O(√(KT)) for finite time horizon T, as well as the asymptotic optimal regret bound for Gaussian rewards  ...  Minimax Optimal Thompson Sampling Algorithm General Thompson sampling strategy We first describe the general Thompson sampling (TS) strategy.  ... 
arXiv:2003.01803v3 fatcat:wjtbnm2yincg5dxusu4fziphjm

Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits [article]

Tianyuan Jin, Pan Xu, Xiaokui Xiao, Anima Anandkumar
2022 arXiv   pre-print
We propose a Thompson sampling algorithm, termed ExpTS, which uses a novel sampling distribution to avoid the under-estimation of the optimal arm.  ...  ExpTS^+ is an anytime bandit algorithm and achieves the minimax optimality and asymptotic optimality simultaneously for exponential family reward distributions.  ...  It is worth noting that MOTS proposed by Jin et al. (2021b) (Thompson sampling with a clipped Gaussian posterior) also achieves the minimax optimal regret for sub-Gaussian rewards, but it can not keep  ... 
arXiv:2206.03520v1 fatcat:dmm5tl7gw5ccfm4ssxmzro7ui4

Bandit Algorithms for Precision Medicine [article]

Yangyi Lu, Ziping Xu, Ambuj Tewari
2021 arXiv   pre-print
Bandit algorithms can be used in any situation where treatment decisions need to be made to optimize some health outcome.  ...  The Oxford English Dictionary defines precision medicine as "medical care designed to optimize efficiency or therapeutic benefit for particular groups of patients, especially by using genetic or molecular  ...  Jin et al. (2020) proposed a variant of TS called MOTS (Minimax Optimal TS) that achieves O(√(KT)) regret.  ... 
arXiv:2108.04782v1 fatcat:dni5wyzyerestgs3upuzz776n4

Near-Optimal Randomized Exploration for Tabular Markov Decision Processes [article]

Zhihan Xiong, Ruoqi Shen, Qiwen Cui, Maryam Fazel, Simon S. Du
2022 arXiv   pre-print
Our result highlights that randomized exploration can be near-optimal, which was previously achieved only by optimistic algorithms.  ...  Mots: Minimax optimal thompson sampling. arXiv preprint arXiv:2003.01803, 2020. Sham M Kakade. J Zico Kolter and Andrew Y Ng. Near-bayesian exploration in polynomial time.  ...  Moreover, randomized value function has its origin from posterior sampling for reinforcement learning (Thompson sampling).  ... 
arXiv:2102.09703v4 fatcat:v3fvypmizffclfxemert5ps55i

Reference list of indexed articles

2000 Discrete Applied Mathematics  
Dwyer, Convex hulls of samples from spherically symmetric distributions 31 (1991) 113-132 784. P. Egyed and R.  ...  and E Hevery, A condition for the strong regularity of matrices in the minimax algebra 11 (1985) 209-222 307. D. Gale and M.  ... 
doi:10.1016/s0166-218x(00)00209-2 fatcat:l53yxlj45ffdrdecl3uah2gj3y

Book Reviews

1958 Journal of the American Statistical Association  
These include maximin utility or minimax loss, minimax risk or regret, the pessimism-optimism index, and the principle of insufficient reason.  ...  Mots, Resources for the Future THESE two volumes have much in common in addition to authorship and subject matter.  ... 
doi:10.1080/01621459.1958.10501476 fatcat:chcvoghyqjfctinppzfp4pvleu

Book of Abstracts Symposium President Board of Directors Editors

Alberto Leardini, Vincenzo Parenti-Castelli, Ugo Della Croce, Rita Stagni, Andrea Cutti, Alberto Leardini, Rita Stagni
2012 XII International Symposium on 3D Analysis of Human Movement Technology & Treatment Bologna (Italy)   unpublished
Discussion A constrained minimax algorithm was utilized with a musculoskeletal model to estimate optimal shoulder immobilization postures following simulated surgical repair of rotator cuff tears.  ...  Use of a minimax algorithm effectively balanced these competing criteria.  ...  Two groups of methods are typically used: single body optimization and constrained multi-body optimization (MO).  ... 
fatcat:lomluaj4fjgbxminxnq7peolsi

Flood Management in a Complex River Basin with a Real-Time Decision Support System Based on Hydrological Forecasts

Javier García Hernández
2011
Step 5 Shuffle complexes: Combine the points in the evolved complexes into a single sample population; sort the sample population in order of increasing criterion value; re-partition or shuffle the sample  ...  Minimax, Maximax and Minimin criteria Regarding costs minimization, the pessimistic criterion, trying to maximize the minimum benefit, becomes the Minimax regret Criterion, which minimizes the maximum  ... 
doi:10.5075/epfl-thesis-5093 fatcat:3ljqaladkjatphfhezinjafrjm

The Integrability of Certain Functions and Related Summability Methods

I. J. Schoenberg
1959 The American mathematical monthly  
is, in fact, not independent of the smaller primes. Let us try to modify (5) by considering only primes p not exceeding x'/?  ...  To prove the necessity let us assume that (1.3) does not hold and let us show that f(x) is discontinuous at every irrational point x =é.  ... 
doi:10.1080/00029890.1959.11989303 fatcat:arq76rsr5zh47eabguhyiftdx4

SCIENTIFIC COMMITTEE INSTITUTE OF COMMAND

Andrzej Glen, Stanisław Zajas, -Deputy-Chairman, Waldemar Kaczmarek, Faculty, Waldemar Kitler, Stan Anton, Vasile Bucinschi, Assoc, Dipl, Pavel Eng, Bučka (+41 others)
2013 unpublished
The criterion of Savage's minimax regret Savage's minimax regret criterion 14 is as pessimistic as Wald's. The method of actions (proceedings) differs from the one described above.  ...  According to the Chief Executive, Kisk Thompson, even the ideas which 'uplifted the company were developed in a team, and not only in Hunt's head.'  ... 
fatcat:qjoylhcf6jb4bapb44yh5nxmxi