3,280 Hits in 2.4 sec

Learning Minimax Estimators via Online Learning [article]

Kartik Gupta, Arun Sai Suggala, Adarsh Prasad, Praneeth Netrapalli, Pradeep Ravikumar
2020 arXiv   pre-print
By leveraging recent results in online learning with non-convex losses, we provide a general algorithm for finding a mixed-strategy Nash equilibrium of general non-convex non-concave zero-sum games.  ...  We consider the problem of designing minimax estimators for estimating the parameters of a probability distribution.  ...  Minimax Estimation via Online Learning In this section, we present our algorithm for computing a mixed strategy NE of the statistical game in Equation (1) (equivalently a pure strategy NE of the linearized  ... 
arXiv:2006.11430v1 fatcat:ieqgr3bfcrcw7nm4c5z6mthuuu

Minimax Optimal Online Imitation Learning via Replay Estimation [article]

Gokul Swamy, Nived Rajaraman, Matthew Peng, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu, Jiantao Jiao, Kannan Ramchandran
2022 arXiv   pre-print
Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator.  ...  estimate to match.  ...  It turns out it is indeed possible to do so, via the technique of replay estimation.  ... 
arXiv:2205.15397v3 fatcat:kmxhqpukvbfwpel3nydmgn2v7m

An Optimal Reduction of TV-Denoising to Adaptive Online Learning [article]

Dheeraj Baby and Xuandong Zhao and Yu-Xiang Wang
2021 arXiv   pre-print
We reveal a deep connection to the seemingly disparate problem of Strongly Adaptive online learning (Daniely et al, 2015) and provide an O(n log n) time algorithm that attains the near minimax optimal  ...  This leads to a new and more versatile alternative to wavelets-based methods for (1) adaptively estimating TV bounded functions; (2) online forecasting of TV bounded trends in time series.  ...  Smoothed online convex optimization in high dimensions via online balanced descent. In Conference on Learning Theory (COLT-18), 2018a. Xi Chen, Yining Wang, and Yu-Xiang Wang.  ... 
arXiv:2101.09438v2 fatcat:pvjqmdemrbfedmw72nzddej2xa

Bootstrapping from Game Tree Search

Joel Veness, David Silver, William T. B. Uther, Alan Blair
2009 Neural Information Processing Systems  
When tested online against human opponents, Meep played at a master level, the best performance of any chess program with a heuristic learned entirely from self-play.  ...  After initialising its weight vector to small random values, Meep was able to learn high quality weights from self-play alone.  ...  The online rating of the heuristic learned by self-play corresponds to weak master level play.  ... 
dblp:conf/nips/VenessSUB09 fatcat:socqd2txnzexdb4u45q3aiybf4

Table of contents

2021 IEEE Transactions on Information Theory  
Su: Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing, 506. PROBABILITY AND STATISTICS.  ...  Loh: Teaching and Learning in Uncertainty, 598. D. Yuan, A. Proutiere, and G. Shi: Distributed Online Linear Regressions, 616. SIGNAL PROCESSING. X. Ding and H.-T.  ... 
doi:10.1109/tit.2020.3044200 fatcat:rd64gmkrlve7fih3hkvq5c4z4i

Minimax Optimal Online Stochastic Learning for Sequences of Convex Functions under Sub-Gradient Observation Failures [article]

Hakan Gokcesu, Suleyman S. Kozat
2019 arXiv   pre-print
We study online convex optimization under stochastic sub-gradient observation faults, where we introduce adaptive algorithms with minimax optimal regret guarantees.  ...  For such a scenario, we propose a blind algorithm that estimates these properties empirically in a generally applicable manner.  ...  INTRODUCTION In online learning, a parameter vector of interest is optimized sequentially based on the feedback coming from the environment.  ... 
arXiv:1904.09369v1 fatcat:3g4ffnit5rdepjfue5kebnw32q

Faster Rates in Regression via Active Learning

Rui M. Castro, Rebecca Willett, Robert D. Nowak
2005 Neural Information Processing Systems  
Active learning algorithms are able to make queries or select sample locations in an online fashion, depending on the results of the previous queries.  ...  Our active learning theory and methods show promise in a number of applications, including field estimation using wireless sensor networks and fault line detection.  ...  There exist practical passive learning strategies that are near-minimax optimal.  ... 
dblp:conf/nips/CastroWN05 fatcat:3wsextu5hvda3duq7hyteftway

Online Label Aggregation: A Variational Bayesian Approach [article]

Chi Hong, Amirmasoud Ghiassi, Yichi Zhou, Robert Birke, Lydia Y. Chen
2020 arXiv   pre-print
To ensure the time relevance and overcome slow responses of workers, online label aggregation is increasingly requested, calling for solutions that can incrementally infer true label distribution via subsets  ...  We compare BiLA with the state of the art based on minimax entropy, neural networks and expectation maximization algorithms, on synthetic and real-world data sets.  ...  It uses a minimax conditional entropy approach to jointly estimate both worker and item matrices. • Label Aware Autoencoders (LAA) [34] : represents the labelling problem via an autoencoder model where  ... 
arXiv:1807.07291v2 fatcat:o3tnrc7vxvbg5eyh4mrgze4n5i

Universal Online Convex Optimization with Minimax Optimal Second-Order Dynamic Regret [article]

Hakan Gokcesu, Suleyman S. Kozat
2022 arXiv   pre-print
We introduce an online convex optimization algorithm which utilizes projected subgradient descent with optimal adaptive learning rates.  ...  We also derive the extension for learning in each decision coordinate individually.  ...  P (T ), via resetting the learning rates at critical rounds.  ... 
arXiv:1907.00497v3 fatcat:6vvf7bnxgzbjrd2efosevbgvvq

Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems [article]

Mohammad Kachuee, Sungjin Lee
2022 arXiv   pre-print
In this study, we introduce a scalable framework for supporting fine-grained exploration targets for individual domains via user-defined constraints.  ...  Furthermore, we present a novel meta-gradient learning approach that is scalable and practical to address this problem.  ...  This method has four hyperparameters controlling the max player optimization via adjusting the update frequency, learning rate, and decay factors.  ... 
arXiv:2209.08429v1 fatcat:4gojyg43qfgi5noz2cn3dscozi

On Lower Bounds for Statistical Learning Theory

Po-Ling Loh
2017 Entropy  
We focus on the settings of parameter and function estimation, community recovery, and online learning for multi-armed bandits.  ...  This paper provides a survey of various techniques used to derive information-theoretic lower bounds for estimation and learning.  ...  Online Learning We now shift our focus to sequential allocation problems.  ... 
doi:10.3390/e19110617 fatcat:5r46ucrik5eubbcfab4aiklwee

Polynomial Methods in Statistical Inference: Theory and Practice

Yihong Wu, Pengkun Yang
2020 Foundations and Trends in Communications and Information Theory  
The effectiveness of the polynomial method is demonstrated in concrete problems such as entropy and support size estimation, distinct elements problem, and learning Gaussian mixture models.  ...  on large domains and learning mixture models.  ...  ISSN online version 1567-2328. Also available as a combined paper and online subscription.  ... 
doi:10.1561/0100000095 fatcat:y5upn6t54jhtddx66tqrdbkkpi

Guest Editorial Special Issue on Distributed Learning Over Wireless Edge Networks—Part II

Mingzhe Chen, Deniz Gunduz, Kaibin Huang, Walid Saad, Mehdi Bennis, Aneta Vulgarakis Feljan, H. Vincent Poor
2022 IEEE Journal on Selected Areas in Communications  
In [A8], Lee et al. study schemes and lower bounds for distributed minimax estimation over a Gaussian multiple-access channel under squared error loss.  ...  Then, the authors derive information-theoretic lower bounds on the minimax risk of any estimation scheme that is restricted to communicate the samples over a given number of uses of the channel.  ... 
doi:10.1109/jsac.2021.3118515 fatcat:3vdy4tamrbccld55nllvbiux4i

Robust Reinforcement Learning: A Constrained Game-theoretic Approach

Jing Yu, Clement Gehring, Florian Schäfer, Animashree Anandkumar
2021 Conference on Learning for Dynamics & Control  
Reinforcement learning (RL) methods provide state-of-art performance in complex control tasks.  ...  We formulate robust RL as a constrained minimax game between the RL agent and an environmental agent which represents uncertainties such as model parameter variations and adversarial disturbances.  ...  Therefore, we present a gradient estimation result whose derivation is deferred to the Appendix in the full version of the paper online.  ... 
dblp:conf/l4dc/YuGSA21 fatcat:lnrakywdnzcgdpqa44w34cpp3e

Experience generalization for concurrent reinforcement learners

Carlos H. C. Ribeiro, Renê Pegoraro, Anna H. Reali Costa
2002 Proceedings of the first international joint conference on Autonomous agents and multiagent systems part 3 - AAMAS '02  
to the Minimax-Q algorithm.  ...  We investigate the use of experience generalization for increasing the rate of convergence of RL algorithms, and contribute a new learning algorithm, Minimax-QS, which incorporates experience generalization  ...  CONCLUSION In this paper we have contributed a Minimax-QS algorithm, in which a spreading function is used to improve online learning time of control policies in multi-agent systems.  ... 
doi:10.1145/545104.545106 fatcat:nsbqr6mjezh7fbgxyhttbgmrcy
Showing results 1 — 15 out of 3,280 results