
Acceleration and Averaging in Stochastic Mirror Descent Dynamics [article]

Walid Krichene, Peter L. Bartlett
2017 arXiv   pre-print
We formulate and study a general family of (continuous-time) stochastic dynamics for accelerated first-order minimization of smooth convex functions.  ...  Building on an averaging formulation of accelerated mirror descent, we propose a stochastic variant in which the gradient is contaminated by noise, and study the resulting stochastic differential equation  ...  Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).  ... 
arXiv:1707.06219v1 fatcat:63we5k3tvrcc7cvyphxdgo56wi

A variational perspective on accelerated methods in optimization

Andre Wibisono, Ashia C. Wilson, Michael I. Jordan
2016 Proceedings of the National Academy of Sciences of the United States of America  
In 1983, Nesterov introduced acceleration in the context of gradient descent for convex functions (1), showing that it achieves an improved convergence rate with respect to gradient descent and moreover  ...  We provide a systematic methodology for converting accelerated higher-order methods from continuous time to discrete time.  ...  The case p = 2 of equation [13] is the continuous-time limit of Nesterov's accelerated mirror descent (11), and the case p = 3 is the continuous-time limit of Nesterov's accelerated cubic-regularized  ... 
doi:10.1073/pnas.1614734113 pmid:27834219 pmcid:PMC5127379 fatcat:aw5cqsmzfvfkpnla2bte4ywnsy

Stochastic Mirror Descent for Convex Optimization with Consensus Constraints [article]

Anastasia Borovykh, Nikolas Kantas, Panos Parpas, Grigorios A. Pavliotis
2022 arXiv   pre-print
In this paper we propose and study exact distributed mirror descent algorithms in continuous-time under additive noise and present the settings that enable linear convergence rates.  ...  The mirror descent algorithm is known to be effective in applications where it is beneficial to adapt the mirror map to the underlying geometry of the optimization model.  ...  Main results and contributions Our results are based on a continuous-time analysis of stochastic mirror descent dynamics.  ... 
arXiv:2201.08642v2 fatcat:b5q55futazalneqwxzo2ofzz4y

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip [article]

Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor
2021 arXiv   pre-print
for the parameters; and a discretization of the continuized process can be computed exactly with convergence rates similar to those of Nesterov's original acceleration.  ...  Finally, using our continuized framework and expressing the gossip averaging problem as the stochastic minimization of a certain energy function, we provide the first rigorous acceleration of asynchronous  ...  Acknowledgements: The authors thank Sam Power for pointing out the class of piecewise deterministic Markov processes and related references.  ... 
arXiv:2106.07644v2 fatcat:6dmvas3rp5byff3cdzsnmlmenu

Stochastic Approximation versus Sample Average Approximation for population Wasserstein barycenters [article]

Darina Dvinskikh
2021 arXiv   pre-print
In the machine learning and optimization community, there are two main approaches for the convex risk minimization problem, namely, the Stochastic Approximation (SA) and the Sample Average Approximation  ...  The preliminary results are derived for a general convex optimization problem given by the expectation in order to have other applications besides the Wasserstein barycenter problem.  ...  Acknowledgements: The work was supported by the Russian Science Foundation (project 18-71-10108), and by the Ministry of Science and  ... 
arXiv:2001.07697v9 fatcat:n3yq27j67fgrbhkvohsdtyezau

On stochastic mirror descent with interacting particles: convergence properties and variance reduction [article]

Anastasia Borovykh, Nikolas Kantas, Panos Parpas, Grigorios A. Pavliotis
2020 arXiv   pre-print
We study the convergence of stochastic mirror descent and make explicit the tradeoffs between communication and variance reduction.  ...  To address this question, we reduce the problem of the computation of an exact minimizer with noisy gradient information to the study of stochastic mirror descent with interacting particles.  ...  Mirror Descent in Discrete and Continuous Time: A generalization of the projected gradient descent method is mirror descent, which is given by: $\nabla\Phi(y_{t+1}) = \nabla\Phi(x_t) - \eta \nabla f(x_t)$, $x_{t+1} = \Pi_X^{\Phi}(y_{t+1})$  ... 
arXiv:2007.07704v2 fatcat:osprl6goprfbhcuttpqgvhtnfq
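
The mirror descent update quoted in the snippet above can be illustrated with a minimal sketch. It assumes choices that the snippet does not state: the constraint set is the probability simplex, the mirror map Φ is the negative entropy (so the dual step becomes a multiplicative update and the Bregman projection becomes a normalization), and the objective, step size, and function names are hypothetical and picked only for illustration.

```python
import math

def mirror_descent_simplex(grad, x0, eta, steps):
    """Entropic mirror descent on the probability simplex (illustrative sketch)."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        # Dual step: grad Phi(y) = grad Phi(x) - eta * g with
        # Phi(x) = sum_i x_i log x_i, which gives y_i = x_i * exp(-eta * g_i).
        y = [xi * math.exp(-eta * gi) for xi, gi in zip(x, g)]
        # Bregman (KL) projection onto the simplex reduces to normalization.
        total = sum(y)
        x = [yi / total for yi in y]
    return x

# Hypothetical objective: f(x) = 0.5 * ||x - target||^2, target inside the simplex.
target = [0.7, 0.2, 0.1]

def grad_f(x):
    return [xi - ti for xi, ti in zip(x, target)]

x_final = mirror_descent_simplex(grad_f, [1 / 3, 1 / 3, 1 / 3], eta=0.5, steps=500)
```

With a different mirror map, such as the squared Euclidean norm, the same two steps reduce to projected gradient descent, which is the sense in which mirror descent generalizes it.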

Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces [article]

Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu
2014 arXiv   pre-print
of mirror descent methods.  ...  This key technical innovation makes it possible to finally design "true" stochastic gradient methods for reinforcement learning.  ...  Principal funding for this research was provided by the National Science Foundation under the grant NSF IIS-1216467.  ... 
arXiv:1405.6757v1 fatcat:u77kqc6iyncy7fixlnrfcnqrmy

A Lyapunov Analysis of Momentum Methods in Optimization [article]

Ashia C. Wilson, Benjamin Recht, Michael I. Jordan
2018 arXiv   pre-print
We show there is an equivalence between the technique of estimate sequences and a family of Lyapunov functions in both continuous and discrete time.  ...  This connection allows us to develop a simple and unified analysis of many existing momentum algorithms, introduce several new algorithms, and strengthen the connection between algorithms and continuous-time  ...  Acknowledgements We would like to give special thanks to Andre Wibisono as well as Orianna Demassi and Stephen Tu for the many helpful discussions involving this paper.  ... 
arXiv:1611.02635v4 fatcat:kzykojk24bddjatbvdiiw5uj4m

Accelerated First-Order Methods: Differential Equations and Lyapunov Functions [article]

Jonathan W. Siegel
2021 arXiv   pre-print
We develop a theory of accelerated first-order optimization from the viewpoint of differential equations and Lyapunov functions.  ...  Our main contributions are to provide a general framework for discretizing the differential equations to produce accelerated methods, and to provide physical intuition which helps explain the optimal  ...  Acknowledgements: We would like to thank Professors Russel Caflisch, Stanley Osher, and Jinchao Xu for their helpful suggestions and comments.  ... 
arXiv:1903.05671v6 fatcat:tp2i2egdc5akrbq64fyalufzky

Enhanced Bilevel Optimization via Bregman Distance [article]

Feihu Huang, Junyi Li, Heng Huang
2022 arXiv   pre-print
, and the inner subproblem is strongly convex.  ...  Meanwhile, we also propose a stochastic bilevel optimization method (SBiO-BreD) to solve stochastic bilevel problems based on stochastic approximated gradients and Bregman distance.  ...  [Beck and Teboulle, 2003] studied mirror descent for convex optimization. Subsequently, Duchi et al. [2010] proposed an effective variant of mirror descent, i.e., composite objective mirror descent  ... 
arXiv:2107.12301v2 fatcat:lvk5tbw65jgjharydo3wrgjeje

Restarting Algorithms: Sometimes there is Free Lunch [article]

Sebastian Pokutta
2020 arXiv   pre-print
We will review restarts in various settings from continuous optimization, discrete optimization, and submodular function maximization where they have delivered impressive results.  ...  Restarts are widely used in different fields and have become a powerful tool to leverage additional information that has not been directly incorporated in the base algorithm or argument.  ...  Acknowledgement We would like to thank Gábor Braun and Marc Pfetsch for helpful comments and feedback on an earlier version of this article.  ... 
arXiv:2006.14810v1 fatcat:caoevby75jfk3fd2g3mjkh5zpa

Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent [article]

Zeyuan Allen-Zhu, Lorenzo Orecchia
2016 arXiv   pre-print
mirror descent, which yields dual progress.  ...  We observe that the performances of gradient and mirror descent are complementary, so that faster algorithms can be designed by LINEARLY COUPLING the two.  ...  Acknowledgements We thank Jon Kelner and Yin Tat Lee for helpful conversations, and Aaditya Ramdas for pointing out a typo in the previous version of this paper.  ... 
arXiv:1407.1537v5 fatcat:2tuq56roj5ejbabswskugb4eze

Statistical inference for the population landscape via moment-adjusted stochastic gradients

Tengyuan Liang, Weijie J. Su
2019 Journal of The Royal Statistical Society Series B-statistical Methodology  
On the statistical front, the theory allows for model mis-specification, with very mild conditions on the data. For optimization, the theory is flexible for both convex and non-convex cases.  ...  This paper makes progress along this direction by introducing the moment-adjusted stochastic gradient descents, a new stochastic optimization method for statistical inference.  ...  algorithm attains optimal acceleration of convergence rate for a strongly convex L (see Polyak and Juditsky [1992] for more details).  ... 
doi:10.1111/rssb.12313 fatcat:6xuoplsqerd33nv54zgimnmm54

Decentralized Algorithms for Wasserstein Barycenters [article]

Darina Dvinskikh
2021 arXiv   pre-print
In this thesis, we consider the Wasserstein barycenter problem of discrete probability measures from computational and statistical sides.  ...  The motivation for dual approaches is closed-forms for the dual formulation of entropy-regularized Wasserstein distances and their derivatives, whereas the primal formulation has closed-form expression  ...  strongly convex functions with Lipschitz continuous gradients.  ... 
arXiv:2105.01587v2 fatcat:jt2335pxozduxhgim2au7x7eha

A Continuous-Time Nesterov Accelerated Gradient Method for Centralized and Distributed Online Convex Optimization [article]

Chao Sun, Guoqiang Hu
2020 arXiv   pre-print
This paper studies the online convex optimization problem by using an Online Continuous-Time Nesterov Accelerated Gradient method (OCT-NAG).  ...  the objective functions and optimal solutions hold.  ...  and the regret bound can be relaxed to O(log(T)) for strongly convex functions.  ... 
arXiv:2009.12545v1 fatcat:g27nvyvb3jge5obzqj5kdh2qai