98 Hits in 6.5 sec

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [article]

Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, Masashi Sugiyama
2021 arXiv   pre-print
Deep learning is often criticized for two serious issues which rarely exist in natural nervous systems: overfitting and catastrophic forgetting.  ...  The empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.  ...  Acknowledgement MS was supported by the International Research Center for Neurointelligence (WPI-IRCN) at The University of Tokyo Institutes for Advanced Study.  ... 
arXiv:2011.06220v3 fatcat:qgdq5w7cdfghraoh5sblkuteh4

Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

Zeke Xie, Fengxiang He, Shaopeng Fu, Issei Sato, Dacheng Tao, Masashi Sugiyama
2021 Neural Computation  
Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting.  ...  The empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.  ...  In section 2, we propose the neural variability theory and mathematically validate that ANV relieves overfitting, label noise memorization, and catastrophic forgetting.  ... 
doi:10.1162/neco_a_01403 pmid:34310675 fatcat:u5tzeigrzrhsxjgozxrpbjgszm
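The snippet above centers on injecting variability into network weights during training. A minimal sketch of that general weight-noise mechanism on a linear least-squares model — the function and parameter names (`nvrm_step`, `sigma`) are illustrative, not taken from the paper:

```python
import numpy as np

def nvrm_step(w, X, y, lr=0.1, sigma=0.01, rng=None):
    """One 'neural variability' training step (sketch): perturb the weights
    with Gaussian noise before computing the gradient, then update the
    clean weights. sigma controls the variability scale."""
    rng = rng or np.random.default_rng(0)
    w_noisy = w + rng.normal(0.0, sigma, size=w.shape)  # inject variability
    grad = 2 * X.T @ (X @ w_noisy - y) / len(y)         # MSE gradient at noisy weights
    return w - lr * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

w = np.zeros(3)
for _ in range(200):
    w = nvrm_step(w, X, y, rng=rng)
# w converges close to w_true despite the per-step weight noise
```

Because the noise has zero mean, the expected gradient matches the noise-free one, so training still converges while the perturbation discourages sharp, memorizing solutions.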

Hierarchical Indian Buffet Neural Networks for Bayesian Continual Learning [article]

Samuel Kessler, Vu Nguyen, Stefan Zohren, Stephen Roberts
2021 arXiv   pre-print
As we automatically learn the number of weights in each layer of the BNN, overfitting and underfitting problems are largely overcome.  ...  We place an Indian Buffet Process (IBP) prior over the structure of a Bayesian Neural Network (BNN), thus allowing the complexity of the BNN to increase and decrease automatically.  ...  The authors would also like to thank Sebastian Farquhar for helpful discussions and Kyriakos Polymenakos for proofreading a draft of the paper. SK is funded by an Oxford-Man studentship.  ... 
arXiv:1912.02290v5 fatcat:z2liyhbrg5fafh77akxcrw7hoe

Continual Learning: Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes [article]

Timothée Lesort
2020 arXiv   pre-print
Therefore, artificial neural networks are often ill-equipped to deal with real-life settings such as an autonomous robot that has to learn on-line to adapt to new situations and overcome new problems without  ...  Artificial neural networks struggle to learn similarly. They often rely on data rigorously preprocessed to learn solutions to specific problems such as classification or regression.  ...  For most of them, in a continual learning situation, the neural network will automatically adapt to the last data only and forget everything learned on the previous ones.  ... 
arXiv:2007.00487v3 fatcat:fwdjynkclbchvgo73qhs6biice

SpaRCe: Improved Learning of Reservoir Computing Systems Through Sparse Representations

Luca Manneschi, Andrew C. Lin, Eleni Vasilaki
2021 IEEE Transactions on Neural Networks and Learning Systems  
SpaRCe alleviates the problem of catastrophic forgetting, a problem most evident in standard echo state networks (ESNs) and recurrent neural networks in general, by increasing the number of task-specialized  ...  "Sparse" neural networks, in which relatively few neurons or connections are active, are common in both machine learning and neuroscience.  ...  forgetting, and Herbert Jaeger for feedback on a preliminary version of this work.  ... 
doi:10.1109/tnnls.2021.3102378 fatcat:fmlzcmiylncthfssbjsehbttea

Unified Regularity Measures for Sample-wise Learning and Generalization [article]

Chi Zhang, Xiaoning Ma, Yu Liu, Le Wang, Yuanqi Su, Yuehu Liu
2021 arXiv   pre-print
Motivated by recent discoveries on network memorization and generalization, we propose a pair of sample regularity measures for both processes with a formulation-consistent representation.  ...  Fundamental machine learning theory shows that different samples contribute unequally to both learning and testing processes. Contemporary studies on DNNs imply that such sample differences  ...  artificial intelligence based on them.  ... 
arXiv:2108.03913v1 fatcat:au7vszgeynbare6sewxgvy2iyi

On the Privacy Risks of Deploying Recurrent Neural Networks in Machine Learning Models [article]

Yunhao Yang, Parham Gohari, Ufuk Topcu
2022 arXiv   pre-print
(i) adding Gaussian noise to the gradients calculated during training as a part of the so-called DP-SGD algorithm and (ii) adding Gaussian noise to the trainable parameters as a part of a post-training mechanism that we  ...  Additionally, we study the effectiveness of two prominent mitigation methods for preempting MIAs, namely weight regularization and differential privacy.  ...  On the other hand, training machine learning models for an extended number of epochs may not always lead to overfitting.  ... 
arXiv:2110.03054v3 fatcat:2jhzemdgsbfurch65lylyush6u
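The first mitigation in the snippet above, DP-SGD, clips per-example gradients and adds Gaussian noise to their average before stepping. A minimal sketch with illustrative parameter names (`clip`, `sigma`); a real implementation calibrates the noise scale to a formal privacy budget:

```python
import numpy as np

def dp_sgd_update(w, per_example_grads, lr=0.1, clip=1.0, sigma=0.5, rng=None):
    """One DP-SGD-style step (sketch): clip each per-example gradient to
    L2 norm `clip`, average the clipped gradients, add Gaussian noise
    scaled by sigma * clip, then take a gradient step."""
    rng = rng or np.random.default_rng(0)
    clipped = [g * min(1.0, clip / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    noisy_mean = np.mean(clipped, axis=0) + rng.normal(
        0.0, sigma * clip / len(per_example_grads), size=np.shape(w))
    return w - lr * noisy_mean

# With sigma=0 the step reduces to plain clipped-gradient descent:
grads = [np.array([3.0, 4.0]), np.array([0.0, 0.5])]  # norms 5.0 and 0.5
w_new = dp_sgd_update(np.zeros(2), grads, lr=1.0, clip=1.0, sigma=0.0)
# the first gradient is rescaled to unit norm; the second is left as-is
```

Clipping bounds any single example's influence on the update, which is what lets the added Gaussian noise mask an individual's membership.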

Continual Learning with Deep Learning Methods in an Application-Oriented Context [article]

Benedikt Pfülb
2022 arXiv   pre-print
One type of machine learning algorithm that can be categorized as a "deep learning" model is the Deep Neural Network (DNN).  ...  These deep learning methods exhibit amazing capabilities for inferring and storing complex knowledge from high-dimensional data.  ...  RQ 3: How can a novel deep learning model be designed so that it avoids catastrophic forgetting?  ... 
arXiv:2207.06233v1 fatcat:yachl2s6bfhjdmapznjlspsxx4

A Dirichlet Process Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning

Zhi Wang, Chunlin Chen, Daoyi Dong
2022 IEEE Transactions on Cybernetics  
While reinforcement learning (RL) algorithms are achieving state-of-the-art performance in various challenging tasks, they can easily encounter catastrophic forgetting or interference when faced with lifelong  ...  With extensive experiments conducted on robot navigation and locomotion domains, we show that our method successfully facilitates scalable lifelong RL and outperforms relevant existing methods.  ...  In deep RL, a deep neural network (DNN) is utilized to approximate the Q-function, known as the deep Q-network (DQN) [27].  ... 
doi:10.1109/tcyb.2022.3170485 pmid:35580095 fatcat:5wxhflq25vbthazi25jesbmhim
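The snippet above notes that a DQN approximates the Q-function with a deep network. The underlying temporal-difference update is easiest to see in tabular form; a DQN replaces the table lookup with a network trained to minimize the same TD error (the state/action values below are purely illustrative):

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Tabular Q-learning update: move Q[s, a] toward the TD target
    r + gamma * max_a' Q[s_next, a']."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

Q = np.zeros((2, 2))                      # 2 states x 2 actions
Q = q_update(Q, s=0, a=1, r=1.0, s_next=1)
# Q[0, 1] moves halfway (alpha=0.5) toward the target 1.0 + 0.9 * 0 = 1.0
```

When the table is replaced by a network, updates to the shared weights for one task can overwrite those for another, which is exactly the interference the paper's mixture of task models is designed to avoid.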

Deep learning in neural networks: An overview

Jürgen Schmidhuber
2015 Neural Networks  
In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning.  ...  I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs  ...  Conclusion and Outlook Deep Learning (DL) in Neural Networks (NNs) is relevant for Supervised Learning (SL) (Sec. 5), Unsupervised Learning (UL) (Sec. 5), and Reinforcement Learning (RL) (Sec. 6).  ... 
doi:10.1016/j.neunet.2014.09.003 pmid:25462637 fatcat:fniwacdkurh2pgbspkaf6uyhyq

Introduction to deep learning [article]

Lihi Shiloh-Perl, Raja Giryes
2020 arXiv   pre-print
It includes both the basic structures used to design deep neural networks and a brief survey of some of their popular use cases.  ...  Deep Learning (DL) has made a major impact on data science in the last decade. This chapter introduces the basic concepts of this field.  ...  This phenomenon is known as catastrophic forgetting [59].  ... 
arXiv:2003.03253v1 fatcat:65ywiu2wlzcgvgkaqco2hpe3ve

Class-incremental learning: survey and performance evaluation on image classification [article]

Marc Masana, Xialei Liu, Bartlomiej Twardowski, Mikel Menta, Andrew D. Bagdanov, Joost van de Weijer
2021 arXiv   pre-print
The main challenge for incremental learning is catastrophic forgetting, which refers to the precipitous drop in performance on previously learned tasks after learning a new one.  ...  Incremental learning of deep neural networks has seen explosive growth in recent years. Initial work focused on task-incremental learning, where a task-ID is provided at inference time.  ...  We do not refer to the scenario where each task only contains a single class, but consider adding a group of classes for each task.  ... 
arXiv:2010.15277v2 fatcat:wacloedzxrea3dgcuwm7xknyxe
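Catastrophic forgetting, as defined in the snippet above, can be reproduced in a few lines: sequentially fitting two tasks with plain gradient descent destroys performance on the first. A toy least-squares illustration (the setup is assumed for illustration, not taken from the survey):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_step(w, X, y, lr=0.1):
    """One gradient-descent step on the mean-squared error."""
    return w - lr * 2 * X.T @ (X @ w - y) / len(y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Two tasks whose optimal weight vectors conflict:
X_a = rng.normal(size=(50, 4)); y_a = X_a @ np.array([1.0, 0.0, 0.0, 0.0])
X_b = rng.normal(size=(50, 4)); y_b = X_b @ np.array([0.0, 1.0, 0.0, 0.0])

w = np.zeros(4)
for _ in range(300):              # learn task A
    w = grad_step(w, X_a, y_a)
err_a_before = mse(w, X_a, y_a)   # near zero: task A is learned

for _ in range(300):              # then train on task B only
    w = grad_step(w, X_b, y_b)
err_a_after = mse(w, X_a, y_a)    # task A error collapses back up
```

The precipitous drop in task A performance is exactly the "catastrophic forgetting" the survey's incremental-learning methods try to prevent.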

Towards Continual Reinforcement Learning: A Review and Perspectives [article]

Khimya Khetarpal, Matthew Riemer, Irina Rish, Doina Precup
2020 arXiv   pre-print
We begin by discussing our perspective on why RL is a natural fit for studying continual learning.  ...  We go on to discuss evaluation of continual RL agents, providing an overview of benchmarks used in the literature and important metrics for understanding agent performance.  ...  We would like to thank Takuya Ito and Martin Klissarov for providing valuable feedback.  ... 
arXiv:2012.13490v1 fatcat:vcleqjnpgrbkvg477d4prmzg2q

Deep Learning for Anomaly Detection in Time-Series Data: Review, Analysis, and Guidelines

Kukjin Choi, Jihun Yi, Changhwa Park, Sungroh Yoon
2021 IEEE Access  
Finally, we offer guidelines for appropriate model selection and training strategy for deep learning-based time series anomaly detection.  ...  Recent deep learning-based works have made impressive progress in this field.  ...  Deep Learning: If the training dataset is small compared with the model capacity, the deep-learning model can memorize the dataset. Hence, the model learns the noise.  ... 
doi:10.1109/access.2021.3107975 fatcat:yrlegcnsy5d47ds3vgbzq64qcu

Improving Generalization of Deep Learning Music Classifiers

Morgan Buisson, Pablo Alonso, Dmitry Bogdanov
2021 Zenodo  
Known as a very powerful tool capable of generalizing better than traditional machine learning approaches, deep learning models still heavily rely on large quantities of annotated data.  ...  All the suggested approaches are experimentally assessed on two state-of-the-art CNN architectures for automatic music classification.  ...  On the contrary, a small training dataset tends to be easier to memorize and makes the model prone to overfitting.  ... 
doi:10.5281/zenodo.5554754 fatcat:thqdptf6qfcjtaz5txu5tmj6vq
Showing results 1 — 15 out of 98 results