A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is
Memorization is worst-case generalization. Based on MacKay's information theoretic model of supervised machine learning, this article discusses how to practically estimate the maximum size of a neural network given a training data set. First, we present four easily applicable rules to analytically determine the capacity of neural network architectures. This allows the comparison of the efficiency of different network architectures independently of a task. Second, we introduce and experimentallyarXiv:1810.02328v1 fatcat:f4zh2nkitrhoff6oun7woe6ua4