A copy of this work was available on the public web and has been preserved in the Wayback Machine; the capture dates from 2020. The file type is application/pdf.
Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width
arXiv pre-print (article), 2020
We propose Taylorized training as an initiative towards better understanding neural network training at finite width. Taylorized training involves training the k-th order Taylor expansion of the neural network at initialization, and is a principled extension of linearized training—a recently proposed theory for understanding the success of deep learning. We experiment with Taylorized training on modern neural network architectures, and show that Taylorized training (1) agrees with full neural […]
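The core idea in the abstract—training the k-th order Taylor expansion of a network around its initialization—can be illustrated for the simplest case, k = 1 (linearized training). The sketch below is not the authors' implementation; it is a minimal reconstruction assuming JAX, with a toy two-layer network and plain gradient descent, where all names (`f`, `f_lin`, `params0`) are illustrative.

```python
# Minimal sketch of first-order "Taylorized" (linearized) training in JAX.
# The model, sizes, and training loop are illustrative assumptions.
import jax
import jax.numpy as jnp

def f(params, x):
    # Toy two-layer network; any differentiable model would do.
    h = jnp.tanh(x @ params["W1"])
    return h @ params["W2"]

def f_lin(params, params0, x):
    # First-order Taylor expansion of f around params0:
    #   f_lin(params) = f(params0) + J_f(params0) . (params - params0)
    # jax.jvp gives both f(params0) and the Jacobian-vector product.
    dparams = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params0)
    y0, jvp = jax.jvp(lambda p: f(p, x), (params0,), (dparams,))
    return y0 + jvp

def loss(params, params0, x, y):
    # Squared loss on the linearized model; convex in params.
    return jnp.mean((f_lin(params, params0, x) - y) ** 2)

key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
params0 = {"W1": jax.random.normal(k1, (3, 8)) / jnp.sqrt(3.0),
           "W2": jax.random.normal(k2, (8, 1)) / jnp.sqrt(8.0)}
x = jax.random.normal(kx, (16, 3))
y = jnp.sin(x.sum(axis=1, keepdims=True))

# Train the linearized model (the k = 1 Taylorized objective).
params = params0
lr = 0.02
for _ in range(300):
    grads = jax.grad(loss)(params, params0, x, y)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```

Higher-order (k > 1) Taylorized training would replace `f_lin` with a k-th order expansion; JAX's `jax.experimental.jet` provides higher-order Taylor-mode differentiation that could serve that purpose, though the details are beyond this sketch.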
arXiv:2002.04010v2
fatcat:hsn6b3qoxndj3irwb6wh7puxum