
Gradient Descent Finds the Cubic-Regularized Non-Convex Newton Step [article]

Yair Carmon, John C. Duchi
2022 arXiv   pre-print
When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth non-convex functions.  ...  We consider the minimization of non-convex quadratic forms regularized by a cubic term, which exhibit multiple saddle points and poor local minima.  ...  YC was partially supported by the Stanford Graduate Fellowship and the Numerical Technologies Fellowship. JCD was partially supported by the National Science Foundation award NSF-CAREER-1553086.  ... 
arXiv:1612.00547v3 fatcat:acu2bcrufngqpcrasog6a25nz4
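
The setting in this snippet can be illustrated with a minimal sketch: plain gradient descent on a cubic-regularized non-convex quadratic f(x) = (1/2) x^T A x + b^T x + (ρ/3) ‖x‖^3. Everything below is an assumption made for illustration and is not taken from the paper: A is taken to be diagonal (that is, we work in its eigenbasis), the function name `cubic_gd_diagonal` is hypothetical, the iterate starts at zero, and the step size is a conservative choice based on a norm bound R for the minimizer.

```python
import math


def cubic_gd_diagonal(eigs, b, rho, num_steps=5000):
    """Gradient descent on f(x) = 0.5*sum(eigs*x^2) + sum(b*x) + (rho/3)*||x||^3.

    eigs holds the diagonal of A (assumed diagonal for simplicity); it may
    contain negative entries, which makes the quadratic part non-convex.
    """
    beta = max(abs(lam) for lam in eigs)          # spectral norm of diagonal A
    b_norm = math.sqrt(sum(v * v for v in b))
    # Assumed conservative bound on the minimizer's norm, used only to size the step.
    radius = beta / (2 * rho) + math.sqrt((beta / (2 * rho)) ** 2 + b_norm / rho)
    eta = 1.0 / (4.0 * (beta + rho * radius))     # small fixed step size
    x = [0.0] * len(b)                            # start at the origin
    for _ in range(num_steps):
        r = math.sqrt(sum(v * v for v in x))
        # Gradient of f: A x + b + rho * ||x|| * x, computed coordinate-wise.
        x = [xi - eta * (lam * xi + bi + rho * r * xi)
             for xi, lam, bi in zip(x, eigs, b)]
    return x


# One negative eigenvalue, so the quadratic alone is unbounded below.
eigs = [-1.0, 2.0]
b = [1.0, 1.0]
rho = 1.0
x_hat = cubic_gd_diagonal(eigs, b, rho)
```

At a global minimizer the gradient vanishes and A + ρ‖x‖ I is positive semidefinite; both conditions can be checked numerically on `x_hat` for this small instance.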

SingCubic: Cyclic Incremental Newton-type Gradient Descent with Cubic Regularization for Non-Convex Optimization [article]

Ziqiang Shi
2020 arXiv   pre-print
The results and technique can serve as a starting point for research on incremental Newton-type gradient descent methods that employ cubic regularization.  ...  In this work, we generalized and unified two recent, completely different works into one by proposing the cyclic incremental Newton-type gradient descent with cubic regularization (SingCubic  ...  Conclusions This paper introduces a novel cyclic incremental Newton-type gradient descent with cubic regularization method called SingCubic for minimizing non-convex finite sums.  ... 
arXiv:2002.06848v1 fatcat:dspflouj35ct5lx4p2lvbygd4e

Forward-backward splitting in deformable image registration: A demons approach

Michael Ebner, Marc Modat, Sebastiano Ferraris, Sebastien Ourselin, Tom Vercauteren
2018 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018)  
and convex regularization associated with a tractable proximal operator.  ...  This interpretation introduces a parallel to the more general Forward-Backward Splitting (FBS) scheme consisting of a forward gradient descent and proximal step.  ...  simple, iterative steps: a forward gradient descent step on f and a so-called proximal, backward gradient descent step.  ... 
doi:10.1109/isbi.2018.8363755 dblp:conf/isbi/EbnerMFOV18 fatcat:372hiicblbdfxcyrnagm62c7em

Stochastic Subspace Cubic Newton Method [article]

Filip Hanzely, Nikita Doikov, Peter Richtárik, Yurii Nesterov
2020 arXiv   pre-print
We prove that as we vary the minibatch size, the global convergence rate of SSCN interpolates between the rate of stochastic coordinate descent (CD) and the rate of cubic regularized Newton, thus giving  ...  Our method can be seen both as a stochastic extension of the cubically-regularized Newton method of Nesterov and Polyak (2006), and a second-order enhancement of stochastic subspace descent of Kozak et  ...  Acknowledgements The work of the second and the fourth author was supported by ERC Advanced Grant 788368.  ... 
arXiv:2002.09526v1 fatcat:yvk6lbawmrexfksvg4p5sy2fuq

Cubic Regularization with Momentum for Nonconvex Optimization [article]

Zhe Wang, Yi Zhou, Yingbin Liang, Guanghui Lan
2019 arXiv   pre-print
Our numerical experiments on various nonconvex optimization problems demonstrate that the momentum scheme can substantially facilitate the convergence of cubic regularization, and perform even better than  ...  However, such a successful acceleration technique has not yet been proposed for second-order algorithms in nonconvex optimization. In this paper, we apply the momentum scheme to cubic regularized (CR) Newton's  ...  Gradient descent efficiently finds the cubic-regularized non-convex Newton step. arXiv:1803.09357. Carmon, Y., Duchi, J. C., Hinder, O., and Sidford, A. (2016).  ... 
arXiv:1810.03763v2 fatcat:cyw6ycinu5bblnxel4ytqxc4om

Sub-sampled Cubic Regularization for Non-convex Optimization [article]

Jonas Moritz Kohler, Aurelien Lucchi
2017 arXiv   pre-print
To the best of our knowledge, this is the first work that gives global convergence guarantees for a sub-sampled variant of cubic regularization on non-convex functions.  ...  We consider the minimization of non-convex functions that typically arise in machine learning. Specifically, we focus our attention on a variant of trust region methods known as cubic regularization.  ...  Gradient descent efficiently finds the cubic-regularized non-convex Newton step, 2016.  ... 
arXiv:1705.05933v3 fatcat:zx2rrkfgsfhfhd2vhjipljquvi

Accelerated Methods for Non-Convex Optimization [article]

Yair Carmon, John C. Duchi, Oliver Hinder, Aaron Sidford
2017 arXiv   pre-print
We present an accelerated gradient method for non-convex optimization problems with Lipschitz continuous first and second derivatives.  ...  The method improves upon the O(ϵ^{-2}) complexity of gradient descent and provides the additional second-order guarantee that ∇^2 f(x) ≽ -O(ϵ^{1/2}) I for the computed x.  ...  YC was partially supported by the Stanford Graduate Fellowship and the Numerical Technologies Fellowship. JCD was partially supported by the National Science Foundation award NSF-CAREER-1553086.  ... 
arXiv:1611.00756v2 fatcat:g4ukocscdzdv5lkl5dz4lqpco4

Stochastic Cubic Regularization for Fast Nonconvex Optimization [article]

Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael I. Jordan
2017 arXiv   pre-print
This paper proposes a stochastic variant of a classic algorithm, the cubic-regularized Newton method [Nesterov and Polyak 2006].  ...  The latter can be computed as efficiently as stochastic gradients. This improves upon the Õ(ϵ^{-4}) rate of stochastic gradient descent.  ...  Whereas gradient descent finds the minimizer of a local second-order Taylor expansion, x_{t+1}^{GD} = argmin_x [ f(x_t) + ∇f(x_t)^⊤ (x − x_t) + (ℓ/2) ‖x − x_t‖^2 ], the cubic regularized Newton method finds the  ... 
arXiv:1711.02838v2 fatcat:3jbqlymnkja4zmb2rdam6hhkke

First-Order Methods for Nonconvex Quadratic Minimization [article]

Yair Carmon, John C. Duchi
2020 arXiv   pre-print
Despite the nonconvexity of these problems we prove that, under mild assumptions, gradient descent converges to their global solutions, and give a non-asymptotic rate of convergence for the cubic variant  ...  When we use Krylov subspace solutions to approximate the cubic-regularized Newton step, our results recover the strongest known convergence guarantees to approximate second-order stationary points of general  ...  YC was partially supported by the Stanford Graduate Fellowship and the Numerical Technologies Fellowship. JCD was partially supported by the National Science Foundation award NSF-CAREER-1553086.  ... 
arXiv:2003.04546v1 fatcat:jje2smtw6rdrloc22zu7dnyfim

On Noisy Negative Curvature Descent: Competing with Gradient Descent for Faster Non-convex Optimization [article]

Mingrui Liu, Tianbao Yang
2017 arXiv   pre-print
The key building block of the proposed algorithms is a novel updating step named the NCG step, which lets a noisy negative curvature descent compete with the gradient descent.  ...  In this paper, we propose to further reduce the number of Hessian-vector products for faster non-convex optimization.  ...  analyzed a gradient descent method for solving the cubic regularization step of (Nesterov and Polyak, 2006).  ... 
arXiv:1709.08571v2 fatcat:ndzw4nwsibaapkenhcdp7ozktm

BILGO: Bilateral greedy optimization for large scale semidefinite programming

Zhifeng Hao, Ganzhao Yuan, Bernard Ghanem
2014 Neurocomputing  
The algorithm thus successfully combines the efficiency of conventional rank-1 update algorithms and the effectiveness of gradient descent.  ...  the leading eigenvector of the descent direction at this iteration.  ...  This is because with a descent direction D in Theorem 1, the convex optimization will find a non-negative step length π (π ≥ 0), which will naturally lead to α = 1 − π ≤ 1 (refer to the experiment in Section  ... 
doi:10.1016/j.neucom.2013.07.024 fatcat:vw6tiehxljanvgdkzlps7i4wuy

Randomized Block Cubic Newton Method [article]

Nikita Doikov, Peter Richtárik
2018 arXiv   pre-print
We study the problem of minimizing the sum of three convex functions: a differentiable, a twice-differentiable, and a non-smooth term in a high-dimensional setting.  ...  components: a linear model with a quadratic regularizer for the differentiable term, a quadratic model with a cubic regularizer for the twice differentiable term, and perfect (proximal) model for the  ...  Since some matrix factorization is used, the cost of the cubically regularized Newton step is actually similar to that of the classical Newton step.  ... 
arXiv:1802.04084v2 fatcat:jstfxlgykjdbfbmxfe3iis5yfy

Optimization Methods for Inverse Problems [article]

Nan Ye, Farbod Roosta-Khorasani, Tiangang Cui
2017 arXiv   pre-print
In this light, the mere non-linear, non-convex, and large-scale nature of many of these inversions gives rise to some very challenging optimization problems.  ...  By highlighting the similarities among the optimization challenges faced by the inverse problem and the machine learning communities, we hope that this survey can serve as a bridge in bringing together  ...  In addition, in non-convex problems, classical Newton direction might not exist (if the Hessian matrix is not invertible) or it might not be an appropriate direction for descent (if the Hessian matrix  ... 
arXiv:1712.00154v1 fatcat:u4rhjnzzw5etje3f55jw6c4kom

Optimizing Costly Functions with Simple Constraints: A Limited-Memory Projected Quasi-Newton Algorithm

Mark Schmidt, Ewout van den Berg, Michael P. Friedlander, Kevin P. Murphy
2009 Journal of machine learning research  
Numerical experiments on one-norm regularized test problems indicate that the proposed method is competitive with state-of-the-art methods such as bound-constrained L-BFGS and orthant-wise descent.  ...  The quadratic approximation is constructed using a limited-memory quasi-Newton update.  ...  Also, one of the most effective solvers for (non-differentiable) ℓ1-regularized optimization problems is an extension of the L-BFGS method, known as orthant-wise descent (OWD) (Andrew and Gao, 2007  ... 
dblp:journals/jmlr/SchmidtBFM09 fatcat:eossvl7fezd27cft5pueelj7q4

Efficient Regret Minimization in Non-Convex Games [article]

Elad Hazan, Karan Singh, Cyril Zhang
2017 arXiv   pre-print
We consider regret minimization in repeated games with non-convex loss functions. Minimizing the standard notion of regret is computationally intractable.  ...  We give gradient-based methods that achieve optimal regret, which in turn guarantee convergence to equilibrium in this framework.  ...  This entails replacing the gradient descent epochs with a cubic-regularized Newton method [NP06, AAZB+16].  ... 
arXiv:1708.00075v1 fatcat:2jzllz4llrfvnig4djc3zvbbuy