A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
File type: application/pdf.
Regularizing activations in neural networks via distribution matching with the Wasserstein metric
[article] · arXiv pre-print · 2020
Regularization and normalization have become indispensable components in training deep neural networks, yielding faster training and improved generalization. We propose the projected error function regularization loss (PER), which encourages activations to follow the standard normal distribution. PER randomly projects activations onto a one-dimensional space and computes the regularization loss in the projected space. PER is similar to the Pseudo-Huber loss in the projected space […]
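The abstract describes the mechanism only at a high level: project activations onto random one-dimensional directions, then penalize mismatch with the standard normal in each projected space. As a hedged sketch of that general recipe, and not the paper's exact PER loss, the PyTorch snippet below estimates a sliced 1-Wasserstein distance between projected activations and draws from N(0, 1); the function name, the number of projections `n_proj`, and the plain W1 penalty standing in for the paper's Pseudo-Huber-like term are all assumptions for illustration.

```python
import torch

def per_style_regularizer(h: torch.Tensor, n_proj: int = 32) -> torch.Tensor:
    """Illustrative sliced-Wasserstein penalty toward N(0, I).

    `h` is a (batch, dim) activation matrix. This is a generic stand-in
    for PER, not the paper's exact formula: it projects activations onto
    random unit directions and compares sorted projections against sorted
    standard-normal samples, which estimates the 1-D Wasserstein-1 distance.
    """
    b, d = h.shape
    # Random directions on the unit sphere (normalized Gaussian vectors).
    v = torch.randn(d, n_proj, device=h.device, dtype=h.dtype)
    v = v / v.norm(dim=0, keepdim=True)
    proj = h @ v  # (batch, n_proj): activations in each 1-D projected space
    proj_sorted, _ = torch.sort(proj, dim=0)
    # Reference draws from the target standard normal, sorted the same way;
    # the mean absolute gap between sorted samples approximates W1 in 1-D.
    ref_sorted, _ = torch.sort(torch.randn_like(proj), dim=0)
    return (proj_sorted - ref_sorted).abs().mean()
```

In training, one would add `lam * per_style_regularizer(h)` to the task loss for the chosen layer's activations `h`, where `lam` is a small illustrative coefficient; the paper's actual penalty and weighting should be taken from arXiv:2002.05366.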
arXiv:2002.05366v2