EigenNet: Towards Fast and Structural Learning of Deep Neural Networks

Ping Luo
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017)
Deep Neural Networks (DNNs) are difficult to train and prone to overfitting. We address these two issues by introducing EigenNet, an architecture that not only accelerates training but also adjusts the number of hidden neurons to reduce overfitting. Both are achieved by whitening the information flows of DNNs and removing the eigenvectors that may capture noise. The former improves the conditioning of the Fisher information matrix, whilst the latter increases generalization capability. These appealing properties of EigenNet can benefit many recent DNN structures, such as network in network and inception, by wrapping their hidden layers into the layers of EigenNet, while preserving the modeling capacities of the original networks. Compared to stochastic gradient descent, EigenNet reduces both the training wall-clock time and the number of updates on various datasets, including MNIST, CIFAR-10, and CIFAR-100.
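To make the mechanism the abstract describes more concrete (whitening a layer's activations via eigen-decomposition and discarding low-eigenvalue directions that are likely to capture noise), the NumPy sketch below shows one plausible realization. The function name `eigen_whiten`, the `energy_keep` retention criterion, and the `eps` stabilizer are assumptions made for illustration, not the paper's actual algorithm.

```python
import numpy as np

def eigen_whiten(H, eps=1e-5, energy_keep=0.99):
    """Whiten a batch of hidden activations and drop low-eigenvalue
    (noise-prone) directions.

    H           : (batch, features) activations of one hidden layer.
    energy_keep : fraction of spectral energy to retain (hypothetical
                  hyper-parameter; the paper's criterion may differ).
    """
    # Center the activations over the mini-batch.
    mu = H.mean(axis=0, keepdims=True)
    Hc = H - mu

    # Eigen-decompose the batch covariance matrix.
    cov = Hc.T @ Hc / max(H.shape[0] - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending order

    # Sort descending and keep only the eigenvectors carrying the
    # bulk of the variance; the low-eigenvalue tail is treated as noise.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum_energy = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cum_energy, energy_keep)) + 1

    # Project onto the retained eigenvectors and rescale each
    # direction to unit variance (whitening).
    W = eigvecs[:, :k] / np.sqrt(eigvals[:k] + eps)
    return Hc @ W  # (batch, k) whitened, reduced activations
```

Applied per mini-batch (or with running estimates) to each hidden layer, a transform of this kind decorrelates and rescales the information flow, which is the property the abstract credits for the improved conditioning of the Fisher information matrix, while the truncation to `k` directions is what shrinks the effective number of hidden neurons.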
doi:10.24963/ijcai.2017/338 dblp:conf/ijcai/Luo17