A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization
[article]
2022
arXiv
pre-print
Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical work focused on deriving tight bounds on model complexity, while empirical work revealed that neural networks exhibit double descent with respect to both the number of training samples and the network size. In this paper, we empirically examined how different layers of a neural network contribute to the model; we found that early layers generally learn …
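The abstract describes probing how individual layers contribute to a trained model. As a minimal sketch of one common way to do this (not necessarily the paper's protocol), the PyTorch snippet below re-initializes one layer at a time and measures the resulting test accuracy; the helper name `layer_contribution` and the choice of layer types are illustrative assumptions.

```python
# Sketch: estimate per-layer contribution by re-initializing a single layer
# of a trained model and measuring test accuracy afterwards. This is an
# assumed, generic probe, not the paper's exact method.
import copy
import torch
import torch.nn as nn

def layer_contribution(model, loader, device="cpu"):
    """Return {layer_name: test accuracy after re-initializing that layer}."""
    results = {}
    for name, module in model.named_modules():
        if not isinstance(module, (nn.Linear, nn.Conv2d)):
            continue
        probe = copy.deepcopy(model).to(device)
        # Re-initialize only the selected layer; all other weights stay trained.
        dict(probe.named_modules())[name].reset_parameters()
        probe.eval()
        correct = total = 0
        with torch.no_grad():
            for x, y in loader:
                x, y = x.to(device), y.to(device)
                correct += (probe(x).argmax(dim=1) == y).sum().item()
                total += y.numel()
        results[name] = correct / max(total, 1)
    return results
```

A layer whose re-initialization barely changes accuracy contributes little on its own, while a large drop marks a layer the model depends on heavily; comparing these drops across depth is one way to relate layer utilization to generalization.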
arXiv:2201.11939v1
fatcat:pwmlbhb2ubaihelkdzwlnmzsj4