Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
[article] · 2017 · arXiv pre-print
Nesterov's accelerated gradient descent (AGD), an instance of the general family of "momentum methods", provably achieves a faster convergence rate than gradient descent (GD) in the convex setting. However, whether these methods are superior to GD in the nonconvex setting remains open. This paper studies a simple variant of AGD, and shows that it escapes saddle points and finds a second-order stationary point in Õ(1/ϵ^{7/4}) iterations, faster than the Õ(1/ϵ^2) iterations required by GD. …
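To make the abstract's setup concrete, below is a minimal Python sketch of a Nesterov-style momentum method that adds a small random perturbation when the gradient becomes tiny, the generic mechanism for escaping saddle points. This is an illustrative sketch only, not the paper's exact algorithm; the function name `perturbed_agd` and all step sizes and thresholds are assumptions chosen for readability.

```python
import numpy as np

def perturbed_agd(grad, x0, eta=0.1, theta=0.1, eps=1e-3, r=1e-3,
                  max_iter=1000, seed=0):
    """Nesterov-style AGD with a random perturbation near stationary
    points (a hedged sketch, not the paper's exact method)."""
    rng = np.random.default_rng(seed)
    x, v = x0.copy(), np.zeros_like(x0)
    for _ in range(max_iter):
        y = x + (1 - theta) * v            # momentum (look-ahead) step
        g = grad(y)
        if np.linalg.norm(g) <= eps:       # near a stationary point,
            y = y + r * rng.standard_normal(y.shape)  # perturb to escape saddles
            g = grad(y)
        x_next = y - eta * g               # gradient step at the look-ahead point
        v = x_next - x                     # update the momentum term
        x = x_next
    return x

# Usage on a toy nonconvex problem f(x, y) = (x^2 - 1)^2 + y^2,
# which has a strict saddle at the origin:
grad = lambda z: np.array([4 * z[0]**3 - 4 * z[0], 2 * z[1]])
x_star = perturbed_agd(grad, np.zeros(2))  # perturbation kicks it off the saddle
```

Started exactly at the saddle (the origin), plain GD would stall because the gradient vanishes there; the perturbation plus momentum drives the iterate toward one of the minima at (±1, 0), which is the qualitative behavior the paper's rates quantify.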
arXiv:1711.10456v1
fatcat:pkiddxkz6nfwzenzxqqptcrluu