Early Stopping in Deep Networks: Double Descent and How to Eliminate it [article]

Reinhard Heckel, Fatih Furkan Yilmaz
2020 arXiv   pre-print
network are learned at different epochs, and eliminating this by proper scaling of stepsizes can significantly improve the early stopping performance.  ...  Inspired by this theory, we study two standard convolutional networks empirically and show that eliminating epoch-wise double descent through adjusting stepsizes of different layers improves the early  ...  Heckel acknowledges support of the NVIDIA Corporation in form of a GPU, and would like to thank Fanny Yang and Alexandru Tifrea for discussions and helpful comments on this manuscript.  ... 
arXiv:2007.10099v2 fatcat:7jwetyt5nfcp3dypwq3azvygem

Doubly Sparsifying Network

Zhangyang Wang, Shuai Huang, Jiayu Zhou, Thomas S. Huang
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
We compare DSN against a few carefully-designed baselines, to verify its consistently superior performance in a wide range of settings.  ...  We propose the doubly sparsifying network (DSN), by drawing inspirations from the double sparsity model for dictionary learning.  ...  Number of layers k + 1 The last thing that we investigate is how well DSN and other methods can scale to deeper cases. We grow k from 1 to 6, resulting in 2 to 7-layer networks.  ... 
doi:10.24963/ijcai.2017/421 dblp:conf/ijcai/WangHZH17 fatcat:7azl7evpivbnxlg5ju75pihxwy

Finite Versus Infinite Neural Networks: an Empirical Study [article]

Jaehoon Lee, Samuel S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein
2020 arXiv   pre-print
on width in ways not captured by double descent phenomena; equivariance of CNNs is only beneficial for narrow networks far from the kernel regime.  ...  similarly to early stopping; floating point precision limits kernel performance beyond a critical dataset size; regularized ZCA whitening improves accuracy; finite network performance depends non-monotonically  ...  We are also grateful to Atish Agarwala and Gamaleldin Elsayed for providing valuable feedbacks on a draft.  ... 
arXiv:2007.15801v2 fatcat:6ervrlzxybgeteh4cpdytu3w2q

Adaptive Ensemble Prediction for Deep Neural Networks based on Confidence Level [article]

Hiroshi Inoue
2019 arXiv   pre-print
In this paper, we first describe our insights on the relationship between the probability of prediction and the effect of ensembling with current deep neural networks; ensembling does not help mispredictions  ...  threshold in addition to yielding a better accuracy with the same cost.  ...  and Statistics (AISTATS) 2019, Naha, Okinawa, Japan. or multi-core CPUs, makes it possible to train deep networks by using tremendously large datasets.  ... 
arXiv:1702.08259v3 fatcat:hkof5txo6bh4roatd5allqh5vi

When and how epochwise double descent happens [article]

Cory Stephenson, Tyler Lee
2021 arXiv   pre-print
Deep neural networks are known to exhibit a 'double descent' behavior as the number of parameters increases.  ...  Our findings indicate that epochwise double descent requires a critical amount of noise to occur, but above a second critical noise level early stopping remains effective.  ...  It is also shown that early stopping removes the double descent peak, similar to the results obtained for linear networks in [11, 12].  ... 
arXiv:2108.12006v1 fatcat:yau6hjlqizb5xcuixonp7joetq

Cellular automata as convolutional neural networks [article]

William Gilpin
2018 arXiv   pre-print
Deep learning techniques have recently demonstrated broad success in predicting complex dynamical systems ranging from turbulence to human speech, motivating broader questions about how neural networks  ...  Our results suggest how the entropy of a physical process can affect its representation when learned by neural networks.  ...  This relationship has high variance in early layers, making it difficult to visually discern in the panel save for the last layer.  ... 
arXiv:1809.02942v1 fatcat:wwggh25xxbc5fpl5ehruhdgyoq

A type of generalization error induced by initialization in deep neural networks [article]

Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma
2020 arXiv   pre-print
How initialization and loss function affect the learning of a deep neural network (DNN), specifically its generalization error, is an important problem in practice.  ...  In this work, by exploiting the linearity of DNN training dynamics in the NTK regime, we provide an explicit and quantitative answer to this problem.  ...  Introduction The wide application of deep learning makes it increasingly urgent to establish quantitative theoretical understanding of the learning and generalization behaviors of deep neural networks  ... 
arXiv:1905.07777v3 fatcat:bp2vd3rrpvcuthpxuhh7htoipi

An Efficient Pest Classification In Smart Agriculture Using Transfer Learning

Tuan Nguyen, Quoc-Tuan Vien, Harin Sellahewa
2021 EAI Endorsed Transactions on Industrial Networks and Intelligent Systems  
To this day, agriculture still remains very important and plays considerable role to support our daily life and economy in most countries.  ...  In this paper, we introduce an efficient method basing on deep learning approach to classify pests from images captured from the crops.  ...  It is essential to control its outbreak. To do so, farmers need to scout their maize crop daily to be able to detect it early.  ... 
doi:10.4108/eai.26-1-2021.168227 fatcat:ixltqdlmdbcgvpwujosewojb7i

Morphological classification of radio galaxies: Capsule Networks versus Convolutional Neural Networks [article]

V. Lukic, M. Brüggen, B. Mingo, J. H. Croston, G. Kasieczka, P.N. Best
2019 arXiv   pre-print
Convolutional neural networks are the deep learning technique that has proven to be the most successful in classifying image data.  ...  The convolutional networks always outperform any variation of the capsule network, as they prove to be more robust to the presence of noise in images.  ...  This paper is based on data obtained with the International LOFAR Telescope (ILT) under project codes LC2 038 and LC3 008  ... 
arXiv:1905.03274v1 fatcat:uqpvducyebf7rotbvi7z2r7ixi

Network Horizon Dynamics I: Qualitative Aspects [article]

B. Dribus, A. Sumner, K. Bist, N. Regmi, J. Sircar, S. Upreti
2019 arXiv   pre-print
We also propose a small-world approach to the horizon problem in the cosmology of the early universe as a novel alternative to the inflationary hypothesis of Guth and Linde.  ...  We explain how such phase transitions distinguish deep neural networks from shallow machine learning architectures, and propose hybrid local/random network designs with surprising connectivity advantages  ...  Acknowledgements We thank Jalynn Roberts, Jessica Garriga, Thomas Naugle, Haley Dozier, Joshua Deaton, Lillie Blackmon, Madeline Leboeuf, and Stephanie Dribus for stimulating discussions and technical  ... 
arXiv:1903.10268v1 fatcat:xl5tn4skqnat5geauhkzn7xvje

Gas Classification Using Deep Convolutional Neural Networks

Pai Peng, Xiaojin Zhao, Xiaofang Pan, Wenbin Ye
2018 Sensors  
In this work, we propose a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification.  ...  In general, the proposed gas neural network, named GasNet, consists of: six convolutional blocks, each block consist of six layers; a pooling layer; and a fully-connected layer.  ...  Wenbin Ye and Pai Peng contributed to the idea of the incentive mechanism and designed the algorithms.  ... 
doi:10.3390/s18010157 pmid:29316723 pmcid:PMC5795481 fatcat:jxdeyladcrhldkyuwfwmc7tqqi

Machine learning with neural networks [article]

B. Mehlig
2021 arXiv   pre-print
Lecture notes for my course on machine learning with neural networks that I have given at Gothenburg University and Chalmers Technical University in Gothenburg, Sweden.  ...  The tendency to overfit is larger for networks with more neurons. One way of avoiding overfitting is to use cross validation and early stopping.  ...  Early stopping caused the training of the larger network to abort after 135 epochs, this corresponds to 824 iterations.  ... 
arXiv:1901.05639v3 fatcat:pyyiywuoxzds5kyc6ohqtqtd3e

Eye-Tracking Signals Based Affective Classification Employing Deep Gradient Convolutional Neural Networks

Yuanfeng Li, Jiangang Deng, Qun Wu, Ying Wang
2021 International Journal of Interactive Multimedia and Artificial Intelligence  
This research aims to develop a deep gradient convolutional neural network (DGCNN) for classifying affection by using eye-tracking signals.  ...  Customizing mini-batch, loss, learning rate, and gradients definition for the training structure of the deep neural network was also deployed finally.  ...  The current practice is to increase the training set, stop early, regularize, dropout, and improve network structure [27].  ... 
doi:10.9781/ijimai.2021.06.002 fatcat:nl4jey3kdzhtphrxo55zzhwsvi

Population Based Training of Neural Networks [article]

Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu
2017 arXiv   pre-print
Networks to maximise the Inception score of generated images.  ...  In all cases PBT results in the automatic discovery of hyperparameter schedules and model selection which results in stable training and better final performance.  ...  Acknowledgments We would like to thank Yaroslav Ganin, Mihaela Rosca, John Agapiou, Sasha Vehznevets, Vlad Firoiu, and the wider DeepMind team for many insightful discussions, ideas, and support.  ... 
arXiv:1711.09846v2 fatcat:d7akwgqzuvhw3m6z5hgngzqojy

Towards Evaluating the Robustness of Neural Networks

Nicholas Carlini, David Wagner
2017 2017 IEEE Symposium on Security and Privacy (SP)  
This makes it difficult to apply neural networks in security-critical areas.  ...  Unfortunately, neural networks are vulnerable to adversarial examples: given an input x and any target classification t, it is possible to find a new input x' that is similar to x but classified as t.  ...  ACKNOWLEDGEMENTS We would like to thank Nicolas Papernot discussing our defensive distillation implementation, and the anonymous reviewers for their helpful feedback.  ... 
doi:10.1109/sp.2017.49 dblp:conf/sp/Carlini017 fatcat:wzvnhpyq3nc2dlmary26lvhwey