Early Stopping in Deep Networks: Double Descent and How to Eliminate it
[article]
2020
arXiv
pre-print
network are learned at different epochs, and eliminating this by proper scaling of stepsizes can significantly improve the early stopping performance. ...
Inspired by this theory, we study two standard convolutional networks empirically and show that eliminating epoch-wise double descent through adjusting stepsizes of different layers improves the early ...
Heckel acknowledges support of the NVIDIA Corporation in the form of a GPU, and would like to thank Fanny Yang and Alexandru Tifrea for discussions and helpful comments on this manuscript. ...
arXiv:2007.10099v2
fatcat:7jwetyt5nfcp3dypwq3azvygem
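As a rough illustration of the layer-wise step-size adjustment described in the entry above, the following PyTorch sketch assigns each layer its own learning rate through optimizer parameter groups. The toy model, the layer grouping, and the 0.5 / 0.1 scale factors are illustrative assumptions, not the paper's exact scheme.

import torch
import torch.nn as nn

# Toy stand-in for the convolutional networks studied in the paper.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 10),
)

base_lr = 0.1
# Give different layers different step sizes so they are effectively
# learned on similar time scales (scale factors chosen for illustration).
param_groups = [
    {"params": model[0].parameters(), "lr": base_lr},
    {"params": model[2].parameters(), "lr": base_lr * 0.5},
    {"params": model[6].parameters(), "lr": base_lr * 0.1},
]
optimizer = torch.optim.SGD(param_groups, lr=base_lr, momentum=0.9)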
Doubly Sparsifying Network
2017
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
We compare DSN against a few carefully-designed baselines, to verify its consistently superior performance in a wide range of settings. ...
We propose the doubly sparsifying network (DSN), by drawing inspirations from the double sparsity model for dictionary learning. ...
The last thing that we investigate is how well DSN and other methods can scale to deeper cases. We grow k from 1 to 6, resulting in 2- to 7-layer networks. ...
doi:10.24963/ijcai.2017/421
dblp:conf/ijcai/WangHZH17
fatcat:7azl7evpivbnxlg5ju75pihxwy
Finite Versus Infinite Neural Networks: an Empirical Study
[article]
2020
arXiv
pre-print
similarly to early stopping; floating point precision limits kernel performance beyond a critical dataset size; regularized ZCA whitening improves accuracy; finite network performance depends non-monotonically on width in ways not captured by double descent phenomena; equivariance of CNNs is only beneficial for narrow networks far from the kernel regime. ...
We are also grateful to Atish Agarwala and Gamaleldin Elsayed for providing valuable feedback on a draft. ...
arXiv:2007.15801v2
fatcat:6ervrlzxybgeteh4cpdytu3w2q
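The entry above notes that regularized ZCA whitening improves accuracy; a minimal NumPy sketch of regularized ZCA whitening is shown below. The eps value is an arbitrary placeholder, since the paper tunes the regularization strength.

import numpy as np

def zca_whiten(X, eps=1e-2):
    # X: (n_samples, n_features). eps regularizes small eigenvalues.
    X = X - X.mean(axis=0, keepdims=True)
    cov = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return X @ W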
Adaptive Ensemble Prediction for Deep Neural Networks based on Confidence Level
[article]
2019
arXiv
pre-print
In this paper, we first describe our insights on the relationship between the probability of prediction and the effect of ensembling with current deep neural networks; ensembling does not help mispredictions ...
threshold in addition to yielding a better accuracy with the same cost. ...
or multi-core CPUs, makes it possible to train deep networks by using tremendously large datasets. ...
arXiv:1702.08259v3
fatcat:hkof5txo6bh4roatd5allqh5vi
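A minimal sketch of the confidence-gated ensembling idea summarized above: stop querying further models once the running average prediction is already confident. The 0.9 threshold, the batch-level stopping rule, and the models list are assumptions made for illustration; the paper derives its threshold from an analysis of prediction probabilities.

import torch.nn.functional as F

def adaptive_ensemble(models, x, confidence_threshold=0.9):
    # Average softmax outputs one model at a time; stop early once every
    # input in the batch is predicted with sufficiently high confidence.
    probs_sum = None
    for i, model in enumerate(models, start=1):
        probs = F.softmax(model(x), dim=-1)
        probs_sum = probs if probs_sum is None else probs_sum + probs
        avg = probs_sum / i
        if avg.max(dim=-1).values.min() >= confidence_threshold:
            break
    return avg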
When and how epochwise double descent happens
[article]
2021
arXiv
pre-print
Deep neural networks are known to exhibit a 'double descent' behavior as the number of parameters increases. ...
Our findings indicate that epochwise double descent requires a critical amount of noise to occur, but above a second critical noise level early stopping remains effective. ...
It is also shown that early stopping removes the double descent peak, similar to the results obtained for linear networks in [11, 12]. ...
arXiv:2108.12006v1
fatcat:yau6hjlqizb5xcuixonp7joetq
Cellular automata as convolutional neural networks
[article]
2018
arXiv
pre-print
Deep learning techniques have recently demonstrated broad success in predicting complex dynamical systems ranging from turbulence to human speech, motivating broader questions about how neural networks ...
Our results suggest how the entropy of a physical process can affect its representation when learned by neural networks. ...
This relationship has high variance in early layers, making it difficult to visually discern in the panel save for the last layer. ...
arXiv:1809.02942v1
fatcat:wwggh25xxbc5fpl5ehruhdgyoq
A type of generalization error induced by initialization in deep neural networks
[article]
2020
arXiv
pre-print
How initialization and loss function affect the learning of a deep neural network (DNN), specifically its generalization error, is an important problem in practice. ...
In this work, by exploiting the linearity of DNN training dynamics in the NTK regime, we provide an explicit and quantitative answer to this problem. ...
Introduction The wide application of deep learning makes it increasingly urgent to establish quantitative theoretical understanding of the learning and generalization behaviors of deep neural networks ...
arXiv:1905.07777v3
fatcat:bp2vd3rrpvcuthpxuhh7htoipi
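For context on the linearized training dynamics the entry above refers to, a standard NTK-regime result for gradient flow on the squared loss (generic notation, not necessarily the paper's) reads
\[
f_t(x) \approx f_0(x) + \Theta(x, X)\,\Theta(X, X)^{-1}\bigl(I - e^{-\eta\,\Theta(X, X)\,t}\bigr)\bigl(y - f_0(X)\bigr),
\]
so the trained predictor depends on the initialization only through the initial function f_0, which is the kind of dependence such an analysis can quantify.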
An Efficient Pest Classification In Smart Agriculture Using Transfer Learning
2021
EAI Endorsed Transactions on Industrial Networks and Intelligent Systems
To this day, agriculture remains very important and plays a considerable role in supporting daily life and the economy in most countries. ...
In this paper, we introduce an efficient method based on a deep learning approach to classify pests from images captured from the crops. ...
It is essential to control its outbreak. To do so, farmers need to scout their maize crop daily to be able to detect it early. ...
doi:10.4108/eai.26-1-2021.168227
fatcat:ixltqdlmdbcgvpwujosewojb7i
Morphological classification of radio galaxies: Capsule Networks versus Convolutional Neural Networks
[article]
2019
arXiv
pre-print
Convolutional neural networks are the deep learning technique that has proven to be the most successful in classifying image data. ...
The convolutional networks always outperform any variation of the capsule network, as they prove to be more robust to the presence of noise in images. ...
This paper is based on data obtained with the International LOFAR Telescope (ILT) under project codes LC2 038 and LC3 008 ...
arXiv:1905.03274v1
fatcat:uqpvducyebf7rotbvi7z2r7ixi
Network Horizon Dynamics I: Qualitative Aspects
[article]
2019
arXiv
pre-print
We also propose a small-world approach to the horizon problem in the cosmology of the early universe as a novel alternative to the inflationary hypothesis of Guth and Linde. ...
We explain how such phase transitions distinguish deep neural networks from shallow machine learning architectures, and propose hybrid local/random network designs with surprising connectivity advantages ...
Acknowledgements We thank Jalynn Roberts, Jessica Garriga, Thomas Naugle, Haley Dozier, Joshua Deaton, Lillie Blackmon, Madeline Leboeuf, and Stephanie Dribus for stimulating discussions and technical ...
arXiv:1903.10268v1
fatcat:xl5tn4skqnat5geauhkzn7xvje
Gas Classification Using Deep Convolutional Neural Networks
2018
Sensors
In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification. ...
In general, the proposed gas neural network, named GasNet, consists of six convolutional blocks, each block consisting of six layers; a pooling layer; and a fully-connected layer. ...
Wenbin Ye and Pai Peng contributed to the idea of the incentive mechanism and designed the algorithms. ...
doi:10.3390/s18010157
pmid:29316723
pmcid:PMC5795481
fatcat:jxdeyladcrhldkyuwfwmc7tqqi
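A hedged PyTorch sketch of the stated GasNet layout (six convolutional blocks, a pooling layer, and a fully-connected layer): the composition of each six-layer block, the channel widths, and the single-channel input are assumptions, since the snippet does not specify them.

import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Assumed six-layer block: (Conv, BatchNorm, ReLU) twice.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
    )

class GasNetSketch(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        widths = [16, 32, 64, 64, 128, 128]    # illustrative channel widths
        blocks, in_ch = [], 1                  # assuming single-channel sensor maps
        for w in widths:
            blocks.append(conv_block(in_ch, w))
            in_ch = w
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)    # the pooling layer
        self.fc = nn.Linear(in_ch, n_classes)  # the fully-connected layer

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        return self.fc(x)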
Machine learning with neural networks
[article]
2021
arXiv
pre-print
Lecture notes for my course on machine learning with neural networks that I have given at Gothenburg University and Chalmers Technical University in Gothenburg, Sweden. ...
The tendency to overfit is larger for networks with more neurons. One way of avoiding overfitting is to use cross validation and early stopping. ...
Early stopping caused the training of the larger network to abort after 135 epochs, which corresponds to 824 iterations. ...
arXiv:1901.05639v3
fatcat:pyyiywuoxzds5kyc6ohqtqtd3e
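A minimal sketch of the early-stopping procedure the notes describe, monitoring a held-out validation loss and keeping the best weights seen so far; train_step and val_loss are hypothetical callables, and the model is assumed to be a torch.nn.Module.

def train_with_early_stopping(model, train_step, val_loss, patience=10, max_epochs=500):
    # Stop once the validation loss has not improved for `patience` epochs,
    # then restore the best weights seen so far.
    best_loss, best_state, stale_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        train_step(model)            # one pass over the training set
        loss = val_loss(model)       # loss on the held-out validation set
        if loss < best_loss:
            best_loss, stale_epochs = loss, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    if best_state is not None:
        model.load_state_dict(best_state)
    return model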
Eye-Tracking Signals Based Affective Classification Employing Deep Gradient Convolutional Neural Networks
2021
International Journal of Interactive Multimedia and Artificial Intelligence
This research aims to develop a deep gradient convolutional neural network (DGCNN) for classifying affection using eye-tracking signals. ...
Customized definitions of the mini-batch, loss, learning rate, and gradients for training the deep neural network were also deployed. ...
The current practice is to increase the training set, stop early, regularize, dropout, and improve network structure [27] . ...
doi:10.9781/ijimai.2021.06.002
fatcat:nl4jey3kdzhtphrxo55zzhwsvi
Population Based Training of Neural Networks
[article]
2017
arXiv
pre-print
Networks to maximise the Inception score of generated images. ...
In all cases PBT results in the automatic discovery of hyperparameter schedules and model selection, leading to stable training and better final performance. ...
Acknowledgments We would like to thank Yaroslav Ganin, Mihaela Rosca, John Agapiou, Sasha Vehznevets, Vlad Firoiu, and the wider DeepMind team for many insightful discussions, ideas, and support. ...
arXiv:1711.09846v2
fatcat:d7akwgqzuvhw3m6z5hgngzqojy
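A minimal sketch of one exploit/explore step in the spirit of the population based training summarized above; the dictionary structure of a population member, the 25% truncation, and the +/-20% hyperparameter perturbation are illustrative assumptions, and in practice this step is interleaved with ordinary training of each worker.

import copy
import random

def pbt_step(population):
    # population: list of dicts with "weights", "hyperparams", "score".
    population.sort(key=lambda member: member["score"], reverse=True)
    cutoff = max(1, len(population) // 4)
    top, bottom = population[:cutoff], population[-cutoff:]
    for member in bottom:
        parent = random.choice(top)
        member["weights"] = copy.deepcopy(parent["weights"])   # exploit: copy a better worker
        member["hyperparams"] = {                              # explore: perturb its hyperparameters
            k: v * random.choice([0.8, 1.2])
            for k, v in parent["hyperparams"].items()
        }
    return population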
Towards Evaluating the Robustness of Neural Networks
2017
2017 IEEE Symposium on Security and Privacy (SP)
This makes it difficult to apply neural networks in security-critical areas. ...
Unfortunately, neural networks are vulnerable to adversarial examples: given an input x and any target classification t, it is possible to find a new input x' that is similar to x but classified as t. ...
ACKNOWLEDGEMENTS We would like to thank Nicolas Papernot for discussing our defensive distillation implementation, and the anonymous reviewers for their helpful feedback. ...
doi:10.1109/sp.2017.49
dblp:conf/sp/Carlini017
fatcat:wzvnhpyq3nc2dlmary26lvhwey
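The entry above concerns optimization-based attacks; as a simpler illustration of the targeted adversarial-example objective (find x' close to x that the model classifies as t), here is a basic iterative sign-gradient attack in PyTorch, not the paper's attack. The eps, step, and iteration counts are arbitrary placeholders.

import torch
import torch.nn.functional as F

def targeted_attack(model, x, target, eps=0.03, step=0.005, iters=20):
    # Nudge x toward the target class while staying inside an
    # L-infinity ball of radius eps around the original input.
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = (x_adv - step * grad.sign()).detach()  # descend toward the target class
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)                      # keep pixel values valid
    return x_adv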
Showing results 1 — 15 out of 9,933 results