10,078 Hits in 2.6 sec

Learning sparse transformations through backpropagation [article]

Peter Bloem
2018 arXiv   pre-print
When such transformations cannot be designed by hand, they can be learned, even through plain backpropagation, for instance in attention mechanisms.  ...  Many transformations in deep learning architectures are sparsely connected.  ...  This scheme is wellestablished, and allows the values of the sparse tensor to be learned efficiently through backpropagation.  ... 
arXiv:1810.09184v1 fatcat:inyg3xtcend5xfcfcii7725edi

Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks [article]

Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Laurent Charlin, Chris Pal, Yoshua Bengio
2017 arXiv   pre-print
Sparse Attentive Backtracking learns an attention mechanism over the hidden states of the past and selectively backpropagates through paths with high attention weights.  ...  A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, coming from having to propagate credit information backwards through every single step of the  ...  We have proposed Sparse Attentive Backtracking, a new biologically motivated algorithm which aims to combine the strengths of full backpropagation through time and truncated backpropagation through time  ... 
arXiv:1711.02326v1 fatcat:wpupkeqvojbgveuqyaxr3je344

Sparse Attentive Backtracking: Temporal CreditAssignment Through Reminding [article]

Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio
2018 arXiv   pre-print
Based on this principle, we study a novel algorithm which only back-propagates through a few of these temporal skip connections, realized by a learned attention mechanism that associates current states  ...  Learning long-term dependencies in extended temporal sequences requires credit assignment to events far back in the past.  ...  This allows the RNN to learn long-term dependencies, as with full backpropagation through time, while still allowing it to only backtrack for a few steps, as with truncated backpropagation through time  ... 
arXiv:1809.03702v1 fatcat:govzecjpuzd75kbuh5qmfz26zq

Deep Component Analysis via Alternating Direction Neural Networks [article]

Calvin Murdock, Ming-Fang Chang, Simon Lucey
2018 arXiv   pre-print
For inference, we propose a differentiable optimization algorithm implemented using recurrent Alternating Direction Neural Networks (ADNNs) that enable parameter learning using standard backpropagation  ...  Experimentally, we demonstrate performance improvements on a variety of tasks, including single-image depth prediction with sparse output constraints.  ...  However, if this algorithm is composed as a finite sequence of differentiable transformations, then the model parameters can still be learned in the same way by backpropagating gradients through the steps  ... 
arXiv:1803.06407v1 fatcat:tfivbuxbvbfc5lepgeglb5gpru

Deep Sparse-coded Network (DSN)

Youngjune Gwon, Miriam Cha, H. T. Kung
2016 2016 23rd International Conference on Pattern Recognition (ICPR)  
We introduce a novel backpropagation algorithm to finetune the proposed DSN beyond the pretraining via greedy layerwise sparse coding and dictionary learning.  ...  It has been considered difficult to learn a useful feature hierarchy by stacking sparse coding layers in a straightforward manner.  ...  a, all through backpropagation.  ... 
doi:10.1109/icpr.2016.7900029 dblp:conf/icpr/GwonCK16 fatcat:5tp3dvhhprcjxoa5vvagohginm

Memorized Sparse Backpropagation [article]

Zhiyuan Zhang, Pengcheng Yang, Xuancheng Ren, Xu Sun
2019 arXiv   pre-print
Neural network learning is typically slow since backpropagation needs to compute full gradients and backpropagate them across multiple layers.  ...  learning.  ...  Considering that a computation unit composed of one linear transformation and one activation function is the cornerstone of various neural networks, we elaborate on our unified sparse backpropagation framework  ... 
arXiv:1905.10194v2 fatcat:imeg4wejyvctbid2ooaxck3f4m

The Role of Architectural and Learning Constraints in Neural Network Models: A Case Study on Visual Space Coding

Alberto Testolin, Michele De Filippo De Grazia, Marco Zorzi
2017 Frontiers in Computational Neuroscience  
For both learning architectures we also explore the role of sparse coding, which has been identified as a fundamental principle of neural computation.  ...  As a case study, we present a series of simulations investigating the emergence of neural coding of visual space for sensorimotor transformations.  ...  The most stable and accurate learning algorithm was resilient backpropagation (Riedmiller and Braun, 1993) . they could support a supervised mapping to the target motor program through a simple linear  ... 
doi:10.3389/fncom.2017.00013 pmid:28377709 pmcid:PMC5360096 fatcat:kdfbgen7anawpl2p3v4u3aqnse

Sparse Factorization Layers for Neural Networks with Limited Supervision [article]

Parker Koch, Jason J. Corso
2016 arXiv   pre-print
We propose two new network layers that are based on dictionary learning: a sparse factorization layer and a convolutional sparse factorization layer, analogous to fully-connected and convolutional layers  ...  Recently, interest has grown in adapting dictionary learning methods for supervised tasks such as classification and inverse problems.  ...  For example, greedy deep dictionary learning [35] sequentially encodes the data through a series of sparse factorizations.  ... 
arXiv:1612.04468v1 fatcat:eq537teoofb2tfr2uq5qfikbki

Long Distance Relationships without Time Travel: Boosting the Performance of a Sparse Predictive Autoencoder in Sequence Modeling [article]

Jeremy Gordon, David Rawlinson, Subutai Ahmad
2019 arXiv   pre-print
State of the art models such as LSTM and Transformer are trained by backpropagation of losses into prior hidden states and inputs held in memory.  ...  We describe a predictive autoencoder called bRSM featuring recurrent connections, sparse activations, and a boosting rule for improved cell utilization.  ...  Common to all the neural approaches reviewed here is the use of some form of deep-backpropagation, either by unrolling through time (see section 3.1.2 for more detail) or through a finite window of recent  ... 
arXiv:1912.01116v1 fatcat:xam7dlqb4fcjbohtl3zxzuzht4

The backpropagation-based recollection hypothesis: Backpropagated action potentials mediate recall, imagination, language understanding and naming [article]

Zied Ben Houidi
2021 arXiv   pre-print
After stating our hypothesis in details, we challenge its assumptions through a thorough literature review.  ...  with the same high accuracy as a state of the art machine learning classifier.  ...  Similar sparse neurons act for instance as pointers to retrieve the visual features of a cat through backpropagated APs that travel backwards to reactivate selectively the neurons that represent the right  ... 
arXiv:2101.04137v3 fatcat:rzuj6fvutreidkflgjlsv4yevi

Learning sparse representations in reinforcement learning [article]

Jacob Rafati, David C. Noelle
2019 arXiv   pre-print
We provide support for this conjecture through computational simulations, demonstrating the benefits of learned sparse representations for three problematic classic control tasks: Puddle-world, Mountain-car  ...  The sparse conjunctive representations can avoid catastrophic interference while still supporting generalization.  ...  This error value was then backpropagated through the network, using the standard backpropagation of error algorithm (Rumelhart et al., 1986) , and connection weights were updated.  ... 
arXiv:1909.01575v1 fatcat:4eueaamz6nfmhbch7pskbqbfvu

Direct Feedback Alignment with Sparse Connections for Local Learning [article]

Brian Crafton, Abhinav Parihar, Evan Gebhardt, Arijit Raychowdhury
2019 arXiv   pre-print
neuron's dependence on the weights and errors located deeper in the network require exhaustive data movement which presents a key problem in enhancing the performance and energy-efficiency of machine-learning  ...  Using a sparse feedback matrix, we show that a neuron needs only a fraction of the information previously used by the feedback alignment algorithms.  ...  Instead the error signal can be fed back to the shallower layers through completely random linear transformations.  ... 
arXiv:1903.02083v2 fatcat:4pvnkjbwyrepdm4uwlgujthfpu

Unsupervised and Supervised Visual Codes with Restricted Boltzmann Machines [chapter]

Hanlin Goh, Nicolas Thome, Matthieu Cord, Joo-Hwee Lim
2012 Lecture Notes in Computer Science  
The codewords are then fine-tuned to be discriminative through the supervised learning from top-down labels.  ...  In this work, we propose a novel visual codebook learning approach using the restricted Boltzmann machine (RBM) as our generative model. Our contribution is three-fold.  ...  Supervised Fine-Tuning After unsupervised learning, we fine-tune the codebooks through supervised learning.  ... 
doi:10.1007/978-3-642-33715-4_22 fatcat:vjoe6a7qlrdoxhlm424gnptu6y

Generalized Physics-Informed Learning through Language-Wide Differentiable Programming [article]

Christopher Rackauckas
In this manuscript we develop an infrastructure for incorporating deep learning into existing scientific computing code through Differentiable Programming (∂P).  ...  We showcase several examples of physics-informed learning which directly utilizes this extension to existing simulation code: neural surrogate models, machine learning on simulated quantum hardware, and  ...  code is simply Julia code Code written in Julia does not have to target Flux in order to be compatible Existing non-machine learning packages can be used as layers in a machine learning framework Flux  ... 
doi:10.6084/m9.figshare.12751934.v1 fatcat:fagg3aiimnawheazsze7ac67q4

DizzyRNN: Reparameterizing Recurrent Neural Networks for Norm-Preserving Backpropagation [article]

Victor Dorobantu, Per Andre Stromhaug, Jess Renteria
2016 arXiv   pre-print
We propose a reparameterization of standard recurrent neural networks to update linear transformations in a provably norm-preserving way through Givens rotations.  ...  The vanishing and exploding gradient problems are well-studied obstacles that make it difficult for recurrent neural networks to learn long-term time dependencies.  ...  Defining the problem Recurrent neural networks (RNNs) are trained by updating model parameters through gradient descent with backpropagation to minimize a loss function.  ... 
arXiv:1612.04035v1 fatcat:istwzreyhjhovkaoxsh66etf3a
« Previous Showing results 1 — 15 out of 10,078 results