Training feedforward neural networks using orthogonal iteration of the Hessian eigenvectors

A. Hunter
2000 Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium  
Second order training algorithms are based upon a local quadratic approximation of the error surface [2]. Given a quadratic error function, the error surface has hyper-ellipsoid contours of equal error. The axes are aligned with the eigenvectors of the Hessian, e_i, with the length of each axis inversely proportional to the corresponding eigenvalue, λ_i. Gradient descent is unacceptably slow since the gradient vector, -g, tends to point across the hyper-ellipsoid in the direction of axes with large eigenvalues, and convergence speed is limited by the condition number of the Hessian, λ_max/λ_min. In contrast to the gradient vector, the Newton direction, -H^{-1}g, points directly to the minimum.
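The contrast the abstract draws between the gradient and Newton directions can be made concrete on a small quadratic. The sketch below is a minimal NumPy illustration with a hypothetical 2×2 Hessian chosen only for demonstration (it is not the paper's training algorithm): -g points across the elliptical contours along the large-eigenvalue axis, the condition number λ_max/λ_min measures that anisotropy, and -H^{-1}g lands exactly on the minimum of the quadratic.

```python
import numpy as np

# Hypothetical ill-conditioned quadratic error surface
#   E(w) = 0.5 * (w - w_star)^T H (w - w_star)
# with Hessian eigenvalues 100 and 1, so the condition number is 100.
H = np.array([[100.0, 0.0],
              [0.0,   1.0]])
w_star = np.array([0.0, 0.0])   # minimum of the quadratic
w = np.array([1.0, 1.0])        # current weight vector

g = H @ (w - w_star)            # gradient of the quadratic at w

# Condition number lambda_max / lambda_min limits gradient-descent speed.
eigvals = np.linalg.eigvalsh(H)            # ascending eigenvalues
print("condition number:", eigvals[-1] / eigvals[0])   # 100.0

# The gradient direction -g points across the elliptical contours,
# dominated by the large-eigenvalue axis ...
print("gradient direction:", -g)           # [-100., -1.]

# ... while the Newton direction -H^{-1} g points straight at the minimum.
newton_dir = -np.linalg.solve(H, g)
print("Newton step lands at:", w + newton_dir)   # exactly w_star for a quadratic
```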
doi:10.1109/ijcnn.2000.857893 dblp:conf/ijcnn/Hunter00 fatcat:uzdcxvzkyrcgbpnn4rrj5gzeti