4,580 Hits in 4.8 sec

Capacity and Trainability in Recurrent Neural Networks [article]

Jasmine Collins, Jascha Sohl-Dickstein, David Sussillo
2017 arXiv   pre-print
Two potential bottlenecks on the expressiveness of recurrent neural networks (RNNs) are their ability to store information about the task in their parameters, and to store information about the input history  ...  These results suggest that many previous results comparing RNN architectures are driven primarily by differences in training effectiveness, rather than differences in capacity.  ...  ACKNOWLEDGEMENTS We would like to thank Geoffrey Irving, Alex Alemi, Quoc Le, Navdeep Jaitly, and Taco Cohen for helpful feedback.  ... 
arXiv:1611.09913v3 fatcat:far6nus7trgcldgmmb2y3jdr7i
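
The snippet contrasts how much task information different RNN architectures can store per parameter. As a rough illustration of what "capacity in parameters" means in practice (not code from the paper), the sketch below counts trainable parameters for vanilla RNN, GRU, and LSTM cells of equal hidden size, assuming the standard gate formulations with one bias vector per gate.

    # Illustrative parameter counts for common recurrent cells (standard formulations,
    # biases included); only meant to make "capacity in parameters" concrete.
    def rnn_cell_params(n_in, n_hidden, n_gates):
        # Each gate/candidate has an input matrix, a recurrent matrix, and a bias vector.
        return n_gates * (n_hidden * n_in + n_hidden * n_hidden + n_hidden)

    for name, gates in [("vanilla RNN", 1), ("GRU", 3), ("LSTM", 4)]:
        print(name, rnn_cell_params(n_in=128, n_hidden=256, n_gates=gates))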

MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks [article]

Minmin Chen
2018 arXiv   pre-print
We introduce MinimalRNN, a new recurrent neural network architecture that achieves comparable performance as the popular gated RNNs with a simplified structure.  ...  It employs minimal updates within RNN, which not only leads to efficient learning and testing but more importantly better interpretability and trainability.  ...  We empirically study the input-output Jacobians in the scope of recurrent neural networks and show that MinimalRNN is more trainable than existing models.  ... 
arXiv:1711.06788v2 fatcat:cgvm7q4mbbctnh7wyudrauq2wm
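
The snippet describes "minimal updates within RNN": the input is mapped once and then blended with the previous state through a single gate. The NumPy sketch below follows that description; the function and variable names are illustrative assumptions, not the reference implementation.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def minimal_rnn_step(x, h, Wx, bz, Uh, Uz, bu):
        """One MinimalRNN-style update: transform the input, then gate it against the old state."""
        z = np.tanh(Wx @ x + bz)              # latent input representation
        u = sigmoid(Uh @ h + Uz @ z + bu)     # single update gate
        return u * h + (1.0 - u) * z          # convex combination of old state and new input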

Complex Unitary Recurrent Neural Networks Using Scaled Cayley Transform

Kehelwala D. G. Maduranga, Kyle E. Helfrich, Qiang Ye
2019 Proceedings of the AAAI Conference on Artificial Intelligence, 33(01)  
In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix.  ...  Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well known difficulty in using RNNs is the vanishing or exploding gradient problem.  ...  This research was supported in part by NSF under grants DMS-1821144 and DMS-1620082. We would also like to thank Devin Willmott for his help on this project.  ... 
doi:10.1609/aaai.v33i01.33014528 fatcat:a3gckfx2qjh2ddgh2cgtufxnuy
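
The construction named in the title is the scaled Cayley transform, which keeps the recurrent matrix unitary by parameterizing it through a skew-Hermitian matrix A and a diagonal scaling matrix D. The NumPy sketch below builds such a matrix; parameterizing D by phases theta (so that D is trainable rather than fixed) is an assumption for illustration, not the authors' code.

    import numpy as np

    def scaled_cayley_unitary(A, theta):
        """Return W = (I + A)^{-1} (I - A) D for skew-Hermitian A and D = diag(exp(i*theta))."""
        n = A.shape[0]
        assert np.allclose(A, -A.conj().T), "A must be skew-Hermitian"
        I = np.eye(n, dtype=complex)
        D = np.diag(np.exp(1j * theta))
        return np.linalg.solve(I + A, I - A) @ D

    # Quick check that W is numerically unitary.
    rng = np.random.default_rng(0)
    M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    A = (M - M.conj().T) / 2                         # skew-Hermitian part of a random matrix
    W = scaled_cayley_unitary(A, rng.uniform(0, 2 * np.pi, 4))
    print(np.allclose(W.conj().T @ W, np.eye(4)))    # True up to floating-point error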

Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform [article]

Kehelwala D. G. Maduranga, Kyle E. Helfrich, Qiang Ye
2019 arXiv   pre-print
In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix.  ...  Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well known difficulty in using RNNs is the vanishing or exploding gradient problem.  ...  This research was supported in part by NSF under grants DMS-1821144 and DMS-1620082. We would also like to thank Devin Willmott for his help on this project.  ... 
arXiv:1811.04142v2 fatcat:3c5td5wwvvchbk6topme2usfta

Predefined Sparseness in Recurrent Sequence Models [article]

Thomas Demeester, Johannes Deleu, Fréderic Godin, Chris Develder
2018 arXiv   pre-print
Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models.  ...  First, in language modeling, we show how to increase hidden state sizes in recurrent layers without increasing the number of parameters, leading to more expressive models.  ...  Acknowledgments We thank the anonymous reviewers for their time and effort, and the valuable feedback.  ... 
arXiv:1808.08720v1 fatcat:epyvwz6wyrewzpqnde7xuena6y
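
The snippet's point is that a sparsity pattern fixed before training lets the hidden state grow without adding parameters. The sketch below uses a random binary mask on the recurrent matrix as one simple way to realize predefined sparseness; the paper's actual patterns may be structured differently, so treat this as a generic illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n_hidden, n_input, density = 512, 128, 0.25

    # Fix the sparsity pattern up front; only the surviving entries count as parameters.
    mask = (rng.uniform(size=(n_hidden, n_hidden)) < density).astype(np.float32)
    W_hh = rng.standard_normal((n_hidden, n_hidden)).astype(np.float32) * mask

    dense_params, sparse_params = n_hidden * n_hidden, int(mask.sum())
    print(f"hidden size {n_hidden}: {sparse_params} recurrent parameters "
          f"instead of {dense_params} ({sparse_params / dense_params:.0%})")

    def step(x, h, W_xh, b):
        # The mask is reapplied so the effective recurrent matrix stays sparse
        # while the hidden state h can be made larger.
        return np.tanh(W_xh @ x + (W_hh * mask) @ h + b)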

Predefined Sparseness in Recurrent Sequence Models

Thomas Demeester, Johannes Deleu, Fréderic Godin, Chris Develder
2018 Proceedings of the 22nd Conference on Computational Natural Language Learning  
Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models.  ...  First, in language modeling, we show how to increase hidden state sizes in recurrent layers without increasing the number of parameters, leading to more expressive models.  ...  Acknowledgments We thank the anonymous reviewers for their time and effort, and the valuable feedback.  ... 
doi:10.18653/v1/k18-1032 dblp:conf/conll/DemeesterDGD18 fatcat:33t3hsicvze3fdivgvc75xqsgi

Echo State Neural Machine Translation [article]

Ankush Garg, Yuan Cao, Qi Ge
2020 arXiv   pre-print
We present neural machine translation (NMT) models inspired by echo state network (ESN), named Echo State NMT (ESNMT), in which the encoder and decoder layer weights are randomly generated then fixed throughout  ...  We show that even with this extremely simple model construction and training procedure, ESNMT can already reach 70-80% quality of fully trainable baselines.  ...  Background Echo State Network (Jaeger, 2001 ) is a special type of recurrent neural network, in which the recurrent matrix (known as "reservoir") and input transformation are randomly generated then fixed  ... 
arXiv:2002.11847v1 fatcat:6qirjfchgvhmheybbxtj5trqv4
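
The core echo state idea in this entry is that recurrent weights are generated randomly, rescaled, and never updated, while only the layers on top are trained. Below is a generic single-layer reservoir sketch in NumPy; the spectral radius of 0.9 and input scaling of 0.1 are arbitrary choices, not the paper's settings.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_res = 64, 1000

    # Reservoir: random, rescaled to spectral radius < 1, then frozen (never trained).
    W_res = rng.standard_normal((n_res, n_res))
    W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))
    W_in = rng.standard_normal((n_res, n_in)) * 0.1

    def run_reservoir(inputs):
        """Collect reservoir states for a sequence; only a readout on top would be trained."""
        h = np.zeros(n_res)
        states = []
        for x in inputs:
            h = np.tanh(W_res @ h + W_in @ x)
            states.append(h)
        return np.stack(states)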

On the impressive performance of randomly weighted encoders in summarization tasks [article]

Jonathan Pilault, Jaehong Park, Christopher Pal
2020 arXiv   pre-print
In this work, we investigate the performance of untrained randomly initialized encoders in a general class of sequence to sequence models and compare their performance with that of fully-trained encoders  ...  We further find that the capacity of the encoder not only improves overall model generalization but also closes the performance gap between untrained randomly initialized and full-trained encoders.  ...  This work was partially supported by the IVADO Excellence Scholarship and the Canada First Research Excellence Fund.  ... 
arXiv:2002.09084v1 fatcat:kra4gsrabjhxvcmbqi2mmnbzdi
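
The "untrained randomly initialized encoder" condition amounts to freezing the encoder at its random initialization and training only the rest of the model. A minimal PyTorch sketch of that setup is shown below; the layer sizes and choice of LSTM modules are assumptions, not the paper's configuration.

    import torch
    from torch import nn

    # Encoder stays at its random initialization; gradients are disabled for all its weights.
    encoder = nn.LSTM(input_size=256, hidden_size=512, num_layers=2, batch_first=True)
    for p in encoder.parameters():
        p.requires_grad_(False)

    # Only the decoder (and any other unfrozen modules) would receive gradient updates.
    decoder = nn.LSTM(input_size=512, hidden_size=512, batch_first=True)
    optimizer = torch.optim.Adam(
        [p for p in decoder.parameters() if p.requires_grad], lr=1e-3
    )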

Orthogonal Recurrent Neural Networks with Scaled Cayley Transform [article]

Kyle Helfrich, Devin Willmott, Qiang Ye
2018 arXiv   pre-print
Recent work on Unitary Recurrent Neural Networks (uRNNs) have been used to address this issue and in some cases, exceed the capabilities of Long Short-Term Memory networks (LSTMs).  ...  In several experiments, the proposed scaled Cayley orthogonal recurrent neural network (scoRNN) achieves superior results with fewer trainable parameters than other unitary RNNs.  ...  Acknowledgements This research was supported in part by NSF Grants DMS-1317424 and DMS-1620082.  ... 
arXiv:1707.09520v3 fatcat:xmrdjr3r45frpdx7dkquowzuye

Echo State Speech Recognition [article]

Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara Sainath
2021 arXiv   pre-print
We propose automatic speech recognition (ASR) models inspired by echo state network (ESN), in which a subset of recurrent neural networks (RNN) layers in the models are randomly initialized and untrained  ...  Overall, we challenge the common practice of training ASR models for all components, and demonstrate that ESN-based models can perform equally well but enable more efficient training and storage than fully-trainable  ...  In this paper, we study this topic for ASR models based on recurrent neural networks (RNN).  ... 
arXiv:2102.09114v1 fatcat:cpiyjlfxyfc5dfhfmitt5bld74

Neural Networks for Delta Hedging [article]

Guijin Son, Joocheol Kim
2021 arXiv   pre-print
In this paper, we explore the landscape of Deep Neural Networks(DNN) based hedging systems by testing the hedging capacity of the following neural architectures: Recurrent Neural Networks, Temporal Convolutional  ...  Networks, Attention Networks, and Span Multi-Layer Perceptron Networks.  ...  capacity of the following neural architectures: Recurrent Neural Networks, Temporal Convolutional Networks, Attention Networks, and Span Multi-Layer  ... 
arXiv:2112.10084v1 fatcat:lo2dd3rkgjdmllbcrmtghh3e6a

Perspectives and challenges for recurrent neural network training

M. Gori, B. Hammer, P. Hitzler, G. Palm
2009 Logic Journal of the IGPL  
This idea was born during a Dagstuhl seminar entitled 'Recurrent Neural Networks-Models, Capacities, and Applications' which took place in 2008 and which centered around the connection of RNNs to biological  ...  Recurrent neural networks (RNNs) offer flexible machine learning tools which share the learning abilities of feedforward networks and which extend their expression abilities based on dynamical equations  ...  These contributions cover a wide area of recent developments in the context of recurrent neural network training. The first contribution is connected to the topic of deep learning.  ... 
doi:10.1093/jigpal/jzp042 fatcat:5lflti5oljamxm6e42dnnoil5q

Bayesian Recurrent Units and the Forward-Backward Algorithm [article]

Alexandre Bittar, Philip N. Garner
2022 arXiv   pre-print
at a very low cost in terms of trainable parameters.  ...  The resulting Bayesian recurrent units can be integrated as recurrent neural networks within deep learning frameworks, while retaining a probabilistic interpretation from the direct correspondence with  ...  Acknowledgements This project received funding under NAST: Neural Architectures for Speech Technology, Swiss National Science Foundation grant 200021 185010.  ... 
arXiv:2207.10486v1 fatcat:ksh4yzk3w5f2jl2bynn7absgle
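
This entry derives recurrent units from their correspondence with the forward-backward algorithm. For readers who want the reference point, here is the textbook forward-backward recursion for a discrete HMM in NumPy; this is the classical algorithm itself, not the paper's Bayesian units, and it omits the scaling usually added for long sequences.

    import numpy as np

    def forward_backward(A, B, pi, obs):
        """A: (S,S) transitions, B: (S,V) emissions, pi: (S,) prior, obs: sequence of symbol indices.
        Returns per-step posterior state probabilities."""
        T, S = len(obs), len(pi)
        alpha = np.zeros((T, S))
        beta = np.zeros((T, S))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):                        # forward pass
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        beta[-1] = 1.0
        for t in range(T - 2, -1, -1):               # backward pass
            beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        gamma = alpha * beta
        return gamma / gamma.sum(axis=1, keepdims=True)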

Towards Real-World Applications of Online Learning Spiral Recurrent Neural Networks

Rudolf SOLLACHER, Huaien GAO
2009 Journal of Intelligent Learning Systems and Applications  
We present a new solution called Spiral Recurrent Neural Networks (SpiralRNN) with an online learning based on an extended Kalman filter and gradients as in Real Time Recurrent Learning.  ...  In a memory capacity evaluation the number of simultaneously memorized and accurately retrievable trajectories of fixed length was counted.  ...  Neural networks, and in particular recurrent neural networks, have proven their suitability at least for offline learning forecast tasks. Examples can be found in [3] or [4] .  ... 
doi:10.4236/jilsa.2009.11001 fatcat:5foaxtjmhreg3cpmm26hf6pnaq
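
The SpiralRNN entry combines an extended Kalman filter with RTRL-style gradients for online learning. The sketch below shows the generic EKF weight update that such schemes build on, with the output Jacobian H assumed to come from RTRL; it illustrates the recipe rather than the authors' implementation.

    import numpy as np

    def ekf_weight_update(w, P, H, y, y_hat, Q, R):
        """One EKF step treating the network weights w as the state vector.
        w: (n,), P: (n,n) weight covariance, H: (m,n) Jacobian of the output w.r.t. w,
        y/y_hat: (m,) target and prediction, Q/R: process and observation noise covariances."""
        P = P + Q                                    # covariance prediction
        S = H @ P @ H.T + R                          # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
        w = w + K @ (y - y_hat)                      # weight correction from the prediction error
        P = P - K @ H @ P                            # covariance update
        return w, P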

Towards fuzzification of adaptation rules in self-adaptive architectures [article]

Tomáš Bureš, Petr Hnětynka, Martin Kruliš, Danylo Khalyeyev, Sebastian Hahner, Stephan Seifermann, Maximilian Walter, Robert Heinrich
2021 arXiv   pre-print
In this paper, we focus on exploiting neural networks for the analysis and planning stage in self-adaptive architectures.  ...  We show how to navigate in this continuum and create a neural network architecture that naturally embeds the original logical rules and how to gradually scale the learning potential of the network, thus  ...  able ones (and additionally having the ability to set the training capacity of the trainable predicates), one can freely  ...  A similar prediction is used in [15] to predict values in sensor networks  ... 
arXiv:2112.09468v1 fatcat:2jvtl3gn6jggxoxpbqcqduz4ke
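
The entry describes embedding logical adaptation rules into a neural network with trainable predicates. As a loose illustration of what fuzzifying a threshold rule can look like (my assumption for the sake of an example, not the architecture from the paper), a crisp condition such as "temperature > T" can be replaced by a sigmoid whose threshold and steepness are trainable:

    import numpy as np

    def fuzzy_greater_than(x, threshold, steepness):
        """Soft, trainable version of the crisp predicate (x > threshold):
        returns a truth degree in (0, 1) instead of a hard 0/1 decision."""
        return 1.0 / (1.0 + np.exp(-steepness * (x - threshold)))

    # Crisp rule: IF temperature > 30 THEN adapt.
    # Fuzzified: the rule fires to a degree; 30.0 and 0.5 could be tuned by training.
    print(fuzzy_greater_than(np.array([25.0, 30.0, 35.0]), threshold=30.0, steepness=0.5))
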
Showing results 1 — 15 out of 4,580 results