A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is application/pdf.
Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks
[article]
2019
arXiv pre-print
The skip-connections used in residual networks have become a standard architecture choice in deep learning due to the increased training stability and generalization performance this architecture provides, although theoretical understanding of this improvement has been limited. In this work, we analyze overparameterized deep residual networks trained by gradient descent following random initialization, and demonstrate that (i) the class of networks learned by gradient descent constitutes a …
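The skip-connection the abstract refers to adds the block's input directly to its output, so each residual block computes an identity map plus a learned perturbation. The following is a minimal numpy sketch of one such block (not the paper's exact parameterization; the two-layer ReLU form and the scaled Gaussian initialization are illustrative assumptions):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    # Residual block: output = x + W2 @ relu(W1 @ x).
    # The identity "skip" term x is what distinguishes this from a
    # plain feed-forward layer.
    return x + W2 @ relu(W1 @ x)

rng = np.random.default_rng(0)
d, m = 4, 8                                      # input width d, hidden width m
x = rng.standard_normal(d)
W1 = rng.standard_normal((m, d)) / np.sqrt(d)    # random (Gaussian) init
W2 = rng.standard_normal((d, m)) / np.sqrt(m)

y = residual_block(x, W1, W2)
# With W2 = 0 the block reduces exactly to the identity map, one
# intuition for why residual networks train stably near random init.
assert np.allclose(residual_block(x, W1, np.zeros((d, m))), x)
```

A deep residual network stacks many such blocks, so even at random initialization the composed map stays close to the identity rather than compounding random transformations layer by layer.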
arXiv:1910.02934v1
fatcat:q5y7ulwlnvgqpav7txyyg6vciq