26,659 Hits in 3.2 sec

Dynamics of Learning Near Singularities in Layered Networks

Haikun Wei, Jun Zhang, Florent Cousseau, Tomoko Ozeki, Shun-ichi Amari
2008 Neural Computation  
We plot dynamic vector fields to demonstrate the universal trajectories of learning near singularities.  ...  We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks which include permutation symmetry of hidden  ...  This paper investigates the dynamics of learning near singularities in layered networks, by discussing the stability, the trajectories of learning, and the plateau phenomena in a unified framework.  ... 
doi:10.1162/neco.2007.12-06-414 pmid:18045020 fatcat:oshrm7e2cvaujjav6gtfelroum
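To make the singular regions concrete: for the simplest case this literature analyzes, a perceptron with two hidden units, the overlap and elimination singularities can be written down directly (a sketch in our own notation, not an excerpt from the paper):

```latex
% Two-hidden-unit perceptron: the input-output map is
\[
  f(x;\theta) \;=\; v_1\,\varphi(w_1^{\top}x) \;+\; v_2\,\varphi(w_2^{\top}x).
\]
% Overlap singularity (permutation symmetry): if w_1 = w_2, the output depends
% only on the sum v_1 + v_2, so a whole line of parameters realizes one function.
% Elimination singularity: if v_2 = 0, the weight vector w_2 is unidentifiable.
% On both sets the Fisher information matrix degenerates, which is what slows
% gradient learning into the plateaus analyzed in this line of work.
```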

Singular Learning of Deep Multilayer Perceptrons for EEG-Based Emotion Recognition

Weili Guo, Guangyu Li, Jianfeng Lu, Jian Yang
2021 Frontiers in Computer Science  
In this paper, we mainly focus on this problem, and analyze the singular learning dynamics of deep multilayer perceptrons theoretically and numerically.  ...  However, there exist singularities in the parameter space of deep neural networks, which may dramatically slow down the training process.  ...  Theoretical Analysis of Singular Learning Dynamics of Deep Multilayer Perceptrons: In this section, we theoretically analyze the learning dynamics near singularities of deep MLPs for the EEG-based emotion  ... 
doi:10.3389/fcomp.2021.786964 fatcat:zjouvpktmzef3jua6xvnq4d3tm

Dynamics of Learning in MLP: Natural Gradient and Singularity Revisited

Shun-ichi Amari, Tomoko Ozeki, Ryo Karakida, Yuki Yoshida, Masato Okada
2018 Neural Computation  
One of our results is a full exploration of the dynamical behaviors of stochastic gradient learning in an elementary singular network.  ...  The dynamics of supervised learning play a main role in deep learning, which takes place in the parameter space of a multilayer perceptron (MLP).  ...  We analyzed the dynamics of learning in the neighborhood of a singular region by using an elementary MLP model.  ... 
doi:10.1162/neco_a_01029 pmid:29064781 fatcat:a3vrorp43bf5vmy2fji4k7oxju
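For reference, a minimal numpy sketch of the two update rules this paper compares, plain stochastic gradient descent versus natural gradient descent, on a toy one-hidden-unit regression model (the model, data, and damping constant are our assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: f(x; theta) = v * tanh(a * x), theta = (a, v).
def f(theta, x):
    a, v = theta
    return v * np.tanh(a * x)

def grad_f(theta, x):
    a, v = theta
    t = np.tanh(a * x)
    return np.stack([v * (1 - t**2) * x, t], axis=-1)   # df/da, df/dv

X = rng.normal(size=200)
y = f((1.5, 0.8), X) + 0.05 * rng.normal(size=200)      # noisy teacher data

theta = np.array([0.1, 0.1])   # start near the elimination singularity v ~ 0
eta, eps = 0.05, 1e-4          # learning rate, damping for the Fisher inverse
for _ in range(500):
    err = f(theta, X) - y
    G = grad_f(theta, X)                     # per-sample Jacobian, shape (N, 2)
    grad = G.T @ err / len(X)                # gradient of the mean squared loss
    # Plain gradient descent would be:  theta -= eta * grad
    # Natural gradient preconditions by the (damped) empirical Fisher, which
    # is nearly degenerate close to the singular region:
    fisher = G.T @ G / len(X)
    theta -= eta * np.linalg.solve(fisher + eps * np.eye(2), grad)
```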

Curvature-corrected learning dynamics in deep neural networks

Dongsung Huh
2020 International Conference on Machine Learning  
In this work, we investigate how curvature correction modifies the learning dynamics in deep linear neural networks and provide analytical solutions.  ...  Deep neural networks exhibit complex learning dynamics due to their non-convex loss landscapes.  ...  For √NGD, the nonlinearity of map learning dynamics is always less than that of one-hidden-layer networks: d_eff < 2.  ... 
dblp:conf/icml/Huh20 fatcat:klcm6l76cjcafgzvir6tw3shvq
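Our reading of the update family studied here, with the exponent notation an assumption based on the √NGD name rather than something quoted from the paper:

```latex
% Curvature-corrected updates precondition the gradient by a power of a
% curvature matrix G (e.g., the Fisher information):
\[
  \Delta\theta \;=\; -\,\eta\, G^{-\alpha}\, \nabla_\theta L,
  \qquad
  \alpha = 0:\ \text{plain GD}, \quad
  \alpha = \tfrac{1}{2}:\ \sqrt{\mathrm{NGD}}, \quad
  \alpha = 1:\ \text{NGD}.
\]
```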

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks [article]

Andrew M. Saxe, James L. McClelland, Surya Ganguli
2014 arXiv   pre-print
Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse.  ...  Despite the linearity of their input-output map, such networks have nonlinear gradient descent dynamics on weights that change with the addition of each new hidden layer.  ...  Figure 3 (left): Dynamics of learning in a three-layer neural network.  ... 
arXiv:1312.6120v3 fatcat:emocqsm7irffzca7gwgivnyrq4
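A minimal numpy sketch of the setting this paper solves exactly: full-batch gradient descent in a three-layer (one-hidden-layer) linear network trained on input-output correlations; each mode of the correlation matrix is learned along a sigmoidal trajectory, visible as step-like drops in the printed loss (dimensions and data are toy assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
Sigma_yx = rng.normal(size=(d, d))     # input-output correlation matrix
                                       # (inputs assumed whitened)
W1 = 1e-3 * rng.normal(size=(d, d))    # small random initialization,
W2 = 1e-3 * rng.normal(size=(d, d))    # as in the paper's analysis
eta = 0.05

for step in range(2001):
    E = Sigma_yx - W2 @ W1             # error in the composite linear map
    # Gradient descent on L = 0.5 * ||Sigma_yx - W2 @ W1||_F^2:
    dW1 = W2.T @ E
    dW2 = E @ W1.T
    W1 += eta * dW1
    W2 += eta * dW2
    if step % 400 == 0:
        # The loss falls in a sequence of sigmoidal steps, one per singular
        # value ("mode") of Sigma_yx, matching the exact solutions.
        print(step, round(0.5 * np.linalg.norm(E) ** 2, 4))
```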

Skip Connections Eliminate Singularities [article]

A. Emin Orhan, Xaq Pitkow
2018 arXiv   pre-print
Several such singularities have been identified in previous works: (i) overlap singularities caused by the permutation symmetry of nodes in a given layer, (ii) elimination singularities corresponding to  ...  These singularities cause degenerate manifolds in the loss landscape that slow down learning.  ... 
arXiv:1701.09175v8 fatcat:vt3ir7yuzrb2hltkohwjfsruna
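A minimal numpy illustration of the elimination-singularity argument (our toy construction, not the paper's code): at a point where a hidden unit's weights are zeroed out, the plain unit's gradient vanishes, while an identity skip connection keeps the unit's output, and hence its gradient signal, nonzero:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)
w = np.zeros(4)        # incoming weights of one hidden unit, at the singularity
v = 0.0                # its outgoing weight, also eliminated

# Plain unit: h = tanh(w.x).  At w = 0 the unit outputs tanh(0) = 0, so the
# loss gradient w.r.t. v is proportional to h = 0, and (since v = 0) the
# gradient w.r.t. w is v * (1 - h**2) * x = 0: learning is stuck.
h_plain = np.tanh(w @ x)
grad_v_plain = h_plain                 # = 0.0 (up to the shared error factor)

# Skip connection: h = x_j + tanh(w.x).  The identity path keeps the unit's
# output, and hence the gradient w.r.t. v, nonzero at the same point.
h_skip = x[0] + np.tanh(w @ x)
grad_v_skip = h_skip                   # = x[0] != 0

print(grad_v_plain, grad_v_skip)
```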

Analysis of Dropout in Online Learning [article]

Kazuyuki Hara
2017 arXiv   pre-print
This paper presents our investigation of the effect of dropout in online learning. We analyzed the effect of dropout on convergence speed near the singular point.  ...  Deep learning is the state of the art in fields such as visual object recognition and speech recognition. This learning uses a large number of layers and a huge number of units and connections.  ...  For a singular teacher, SGD shows slow dynamics of R_ii near the singular point, that is, near R_ii = 1.  ... 
arXiv:1711.03343v1 fatcat:y3zow44agvhfrightq3sikabt4
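A minimal numpy sketch of the setting analyzed here: online SGD (one fresh example per step) on a soft-committee two-layer network with inverted dropout on the hidden units; sizes and rates are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, p = 100, 4, 0.5       # input dim, hidden units, dropout keep probability
W = rng.normal(size=(K, N)) / np.sqrt(N)     # student
B = rng.normal(size=(K, N)) / np.sqrt(N)     # teacher (possibly singular)
eta = 0.1

for _ in range(10_000):
    x = rng.normal(size=N)            # online learning: one fresh example
    y = np.sum(np.tanh(B @ x))        # soft-committee teacher output
    mask = (rng.random(K) < p)        # drop each hidden unit independently
    a = np.tanh(W @ x)
    h = a * mask / p                  # inverted-dropout rescaling
    err = h.sum() - y
    # SGD on 0.5 * err**2: dropped units receive no update this step.
    W -= (eta / N) * np.outer(err * (mask / p) * (1 - a**2), x)
```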

Slow Dynamics Due to Singularities of Hierarchical Learning Machines

Hyeyoung Park, Masato Inoue, Masato Okada
2005 Progress of Theoretical Physics Supplement  
Recently, slow dynamics in the learning of neural networks has become known to be closely related to singularities, which exist in the parameter spaces of hierarchical learning models.  ...  To show the influence of singular structure on learning dynamics, we take statistical mechanical approaches and investigate online-learning dynamics under various learning scenarios with different relationship  ...  By taking a geometrical viewpoint and a statistical mechanical approach to learning dynamics, we found the existence of quasi-plateaus and the severity of slow dynamics in the near-singular case, which is interesting  ... 
doi:10.1143/ptps.157.275 fatcat:zff5ioeifvbf7jgylg6syfo4fq
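The statistical mechanical analyses in this line of work track learning through macroscopic order parameters rather than individual weights; in standard teacher-student notation (ours, not necessarily the paper's):

```latex
% Teacher-student order parameters for student weights w_i and teacher
% weights B_j in an N-dimensional input space:
\[
  Q_{ij} = \frac{w_i \cdot w_j}{N}, \qquad
  R_{ij} = \frac{w_i \cdot B_j}{N}, \qquad
  T_{ij} = \frac{B_i \cdot B_j}{N}.
\]
% Plateaus appear when (Q, R) stall near a singular configuration, e.g. two
% student rows becoming identical (overlap) or a row's norm collapsing
% (elimination).
```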

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks

Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington
2018 International Conference on Machine Learning  
In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers.  ...  Eqn. (2.9) defines the linear dynamics of random convolutional neural networks near their fixed points and is the basis for the in-depth analysis of the following subsections.  ... 
dblp:conf/icml/XiaoBSSP18 fatcat:kystouejzbbevowfclztn2dhce
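A minimal numpy sketch of the Delta-Orthogonal initialization this paper uses to achieve dynamical isometry in very deep CNNs; the shape conventions and the sub-orthogonal construction for unequal widths are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def delta_orthogonal(c_out, c_in, k):
    """Delta-orthogonal conv kernel, shape (k, k, c_in, c_out): every spatial
    tap is zero except the center, which holds a matrix with orthonormal
    rows, so the layer acts as an isometry on the channel space."""
    assert c_out >= c_in, "sketch assumes non-shrinking channel width"
    q, _ = np.linalg.qr(rng.normal(size=(c_out, c_out)))  # random orthogonal
    kernel = np.zeros((k, k, c_in, c_out))
    kernel[k // 2, k // 2] = q[:c_in, :]                  # center tap only
    return kernel

K = delta_orthogonal(8, 4, 3)
W = K[1, 1]                              # the center tap
print(np.allclose(W @ W.T, np.eye(4)))   # True: norm-preserving channel map
```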

Statistical mechanical analysis of learning dynamics of two-layer perceptron with multiple output units

Yuki Yoshida, Ryo Karakida, Masato Okada, Shun-ichi Amari
2019 Journal of Physics A: Mathematical and Theoretical  
This paper uses a statistical mechanical formalization to analyze the dynamics of learning in a two-layer perceptron with multidimensional output.  ...  Furthermore, we showed theoretically that singular-region-driven plateaus seldom occur in the learning process in the case of orthogonalized initializations.  ...  The learning dynamics of neural networks have been studied in various settings.  ... 
doi:10.1088/1751-8121/ab0669 fatcat:f4tei2qpnrhajeqe2uy3usrwsa
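One plausible reading of the "orthogonalized initializations" finding, sketched in numpy (our construction, not the paper's code): start the student's hidden weight vectors mutually orthogonal, so learning begins far from every pairwise overlap singularity:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 100, 4                              # input dim, hidden units

# Random init: hidden weight vectors have random overlaps, so the trajectory
# can pass near the overlap singularity w_i = w_j and stall on a plateau.
W_random = rng.normal(size=(K, N)) / np.sqrt(N)

# Orthogonalized init: QR makes the K hidden vectors exactly orthogonal,
# starting learning away from every pairwise-overlap singular region.
q, _ = np.linalg.qr(rng.normal(size=(N, K)))   # N x K, orthonormal columns
W_ortho = q.T / np.sqrt(N)

print(np.round(W_ortho @ W_ortho.T * N, 6))    # identity: zero overlaps
```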

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks

Andrew Saxe, Shagun Sodhani, Sam Jay Lewallen
2022 International Conference on Machine Learning  
Crucially, because of the gating, these networks can compute nonlinear functions of their input. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning.  ...  Our analysis demonstrates that the learning dynamics in structured networks can be conceptualized as a neural race with an implicit bias towards shared representations, which then govern the model's ability  ...  A.S. is a CIFAR Azrieli Global Scholar in the Learning in Machines & Brains program.  ... 
dblp:conf/icml/SaxeSL22 fatcat:yoy6zyqulfacbgikb6mqjqcupi

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks [article]

Andrew M. Saxe, Shagun Sodhani, Sam Lewallen
2022 arXiv   pre-print
Crucially, because of the gating, these networks can compute nonlinear functions of their input. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning.  ...  Our analysis demonstrates that the learning dynamics in structured networks can be conceptualized as a neural race with an implicit bias towards shared representations, which then govern the model's ability  ...  A.S. is a CIFAR Azrieli Global Scholar in the Learning in Machines & Brains program.  ... 
arXiv:2207.10430v1 fatcat:ye7zamhdqre5leldby2vmqb7ru
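A minimal numpy sketch of the kind of gated network both versions of this paper analyze (architecture details are our assumption): for a fixed gate pattern the map is linear, but input- or context-dependent gating makes the overall computation nonlinear:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 6, 8
W1 = 0.01 * rng.normal(size=(h, d))
W2 = 0.01 * rng.normal(size=(d, h))

def gated_forward(x, g):
    """Gated linear network: the 0/1 gate pattern g selects which hidden
    pathways are active.  For fixed g the map W2 diag(g) W1 is linear; a
    context-dependent g makes the overall function nonlinear."""
    return W2 @ (g * (W1 @ x))

x = rng.normal(size=d)
g_a = (rng.random(h) < 0.5).astype(float)   # context A gating pattern
g_b = 1.0 - g_a                             # context B: complementary units
y_a, y_b = gated_forward(x, g_a), gated_forward(x, g_b)
# Each context sees its own linear subnetwork; the pathways "race" to learn
# shared versus context-specific structure, the reduction the paper analyzes.
print(y_a, y_b)
```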

Scale Normalization [article]

Henry Z. Lo and Kevin Amaral and Wei Ding
2016 arXiv   pre-print
One of the difficulties of training deep neural networks is caused by improper scaling between layers.  ...  Results suggest that isometry is important in the beginning of learning, and maintaining it leads to faster learning.  ...  Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. ICLR, 2014.  ... 
arXiv:1604.07796v1 fatcat:simrhxrvzrgh3pbuakr2wzfwvu
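A heavily hedged sketch of one simple way to correct inter-layer scaling: rebalance layer norms to their geometric mean, so the product of scales (and hence a linear network's end-to-end map) is preserved. This is a generic construction for illustration, not necessarily the exact procedure this paper proposes:

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_normalize(weights, eps=1e-8):
    """Rescale every weight matrix to the geometric mean of the layer norms.
    The rescaling factors multiply to 1, so for a linear network the
    end-to-end map is unchanged; only how the overall scale is distributed
    across layers changes."""
    norms = np.array([np.linalg.norm(W) for W in weights]) + eps
    target = np.exp(np.log(norms).mean())       # geometric mean of norms
    return [W * (target / n) for W, n in zip(weights, norms)]

layers = [rng.normal(size=(8, 8)) * s for s in (0.01, 1.0, 100.0)]
balanced = scale_normalize(layers)
print([round(float(np.linalg.norm(W)), 2) for W in balanced])  # equal norms
```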

The emergence of number and syntax units in LSTM language models

Yair Lakretz, German Kruszewski, Theo Desbordes, Dieuwke Hupkes, Stanislas Dehaene, Marco Baroni
2019 Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)  
The singular and plural units had emerged at the second layer of the network.  ...  The singular unit didn't show a similar effect in this case, which highlights the importance of using carefully crafted stimuli, as in the nounPP and nounPPAdv tasks, for understanding network dynamics  ... 
doi:10.18653/v1/n19-1002 dblp:conf/naacl/LakretzKDHDB19 fatcat:3mnoalkatvgrhafoefihhifd7e

The emergence of number and syntax units in LSTM language models [article]

Yair Lakretz, German Kruszewski, Theo Desbordes, Dieuwke Hupkes, Stanislas Dehaene, Marco Baroni
2019 arXiv   pre-print
We present here a detailed study of the inner mechanics of number tracking in LSTMs at the single neuron level.  ...  We conclude that LSTMs are, to some extent, implementing genuinely syntactic processing mechanisms, paving the way to a more general understanding of grammatical encoding in LSTMs.  ...  The singular and plural units had emerged at the second layer of the network.  ... 
arXiv:1903.07435v2 fatcat:wpps2oavjffxtc5v5v6baj3gaq
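The single-neuron methodology here amounts to recording a candidate unit's cell state while the network reads a sentence; a minimal PyTorch sketch of extracting such a trace (untrained weights, made-up token ids, and a hypothetical unit index):

```python
import torch

torch.manual_seed(0)
vocab, emb_d, hid = 50, 16, 32
embed = torch.nn.Embedding(vocab, emb_d)
cell = torch.nn.LSTMCell(emb_d, hid)       # stepping manually exposes c_t

tokens = torch.tensor([3, 17, 8, 42, 5])   # made-up ids for a test sentence
unit = 7                                   # hypothetical "number unit" index

h = torch.zeros(1, hid)
c = torch.zeros(1, hid)
trace = []
for t in tokens:
    h, c = cell(embed(t).unsqueeze(0), (h, c))
    trace.append(c[0, unit].item())        # record the unit's cell state
# In the paper's analysis, a unit whose trace flips with the subject's
# grammatical number and persists across attractor phrases is a number unit.
print(trace)
```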
Showing results 1–15 of 26,659