We initiate the study of the inherent tradeoffs between the size of a neural network and its robustness, as measured by its Lipschitz constant. We make a precise conjecture that, for any Lipschitz activation function and for most datasets, any two-layers neural network with k neurons that perfectly fits the data must have its Lipschitz constant larger (up to a constant) than √(n/k), where n is the number of datapoints. In particular, this conjecture implies that overparametrization is necessary for robustness.

arXiv:2009.14444v2
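Below is a minimal numerical sketch (not from the paper) illustrating the conjectured tradeoff: it fits a two-layer ReLU network with k neurons to n random datapoints and compares an empirical lower bound on the Lipschitz constant (the largest input-gradient norm over the data) with the conjectured scaling √(n/k). The dimensions, labels, optimizer, and training budget are illustrative assumptions, and for small k the network may only approximately fit the data.

```python
# Hedged sketch: empirical Lipschitz lower bound vs. the conjectured sqrt(n/k).
# All hyperparameters here (d, n, k, learning rate, steps) are arbitrary choices
# for illustration, not taken from the paper.
import torch

torch.manual_seed(0)
d, n = 20, 200                       # input dimension, number of datapoints
X = torch.randn(n, d) / d ** 0.5     # roughly unit-norm random inputs
y = torch.sign(torch.randn(n, 1))    # random +/-1 labels ("most datasets")

for k in (10, 50, 200, 1000):        # number of hidden neurons
    net = torch.nn.Sequential(
        torch.nn.Linear(d, k), torch.nn.ReLU(), torch.nn.Linear(k, 1)
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(3000):            # train until the data are (nearly) fit
        opt.zero_grad()
        loss = ((net(X) - y) ** 2).mean()
        loss.backward()
        opt.step()

    # Lipschitz lower bound: largest per-sample input-gradient norm on the data.
    Xg = X.clone().requires_grad_(True)
    net(Xg).sum().backward()
    lip_lb = Xg.grad.norm(dim=1).max().item()
    print(f"k={k:5d}  fit MSE={loss.item():.3f}  "
          f"grad-norm bound={lip_lb:6.2f}  sqrt(n/k)={(n / k) ** 0.5:5.2f}")
```

Under the conjecture, only the overparametrized regime k ≈ n (or larger) can combine a perfect fit with an O(1) Lipschitz constant, which is the sense in which overparametrization would be necessary for robustness.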