A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2016; you can also visit the original URL.
The file type is application/pdf
.
Using Curvature Information for Fast Stochastic Search
1996
Neural Information Processing Systems
We present an algorithm for fast stochastic gradient descent that uses a nonlinear adaptive momentum scheme to optimize the late time convergence rate. The algorithm makes effective use of curvature information, requires only O(n) storage and computation, and delivers convergence rates close to the theoretical optimum. We demonstrate the technique on linear and large nonlinear backprop networks.
dblp:conf/nips/OrrL96
fatcat:uiwsuxbydnhgjffwdtz37icxru