Capacity and Trainability in Recurrent Neural Networks
[article]
2017
arXiv
pre-print
Two potential bottlenecks on the expressiveness of recurrent neural networks (RNNs) are their ability to store information about the task in their parameters, and to store information about the input history ...
These results suggest that many previous results comparing RNN architectures are driven primarily by differences in training effectiveness, rather than differences in capacity. ...
ACKNOWLEDGEMENTS We would like to thank Geoffrey Irving, Alex Alemi, Quoc Le, Navdeep Jaitly, and Taco Cohen for helpful feedback. ...
arXiv:1611.09913v3
fatcat:far6nus7trgcldgmmb2y3jdr7i
MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
[article]
2018
arXiv
pre-print
We introduce MinimalRNN, a new recurrent neural network architecture that achieves performance comparable to popular gated RNNs with a simplified structure. ...
It employs minimal updates within RNN, which not only leads to efficient learning and testing but more importantly better interpretability and trainability. ...
We empirically study the input-output Jacobians in the scope of recurrent neural networks and show that MinimalRNN is more trainable than existing models. ...
arXiv:1711.06788v2
fatcat:cgvm7q4mbbctnh7wyudrauq2wm
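To make the "minimal updates" described in the MinimalRNN entry above concrete, the following numpy sketch shows a single-gate recurrent update of that kind: the input is first mapped into the latent space, then blended with the previous state through one update gate. The exact parametrization (Wx, Uh, Uz, b) and the tanh/sigmoid choices are illustrative assumptions, not a verbatim reproduction of the paper's equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def minimal_rnn_step(x_t, h_prev, Wx, Uh, Uz, b):
    """One minimal gated step: map the input to the latent space,
    then take a gated convex combination with the previous state."""
    z_t = np.tanh(Wx @ x_t)                      # latent input
    u_t = sigmoid(Uh @ h_prev + Uz @ z_t + b)    # single update gate
    return u_t * h_prev + (1.0 - u_t) * z_t      # blended new state

# Toy dimensions; weights would normally be learned.
rng = np.random.default_rng(0)
d_in, d_h = 4, 8
Wx = rng.normal(scale=0.1, size=(d_h, d_in))
Uh = rng.normal(scale=0.1, size=(d_h, d_h))
Uz = rng.normal(scale=0.1, size=(d_h, d_h))
b = np.zeros(d_h)

h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):           # a short input sequence
    h = minimal_rnn_step(x_t, h, Wx, Uh, Uz, b)
```

Keeping a single gate is what makes the step-wise input-output Jacobians simple to analyze, which is the angle the abstract takes on trainability.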
Complex Unitary Recurrent Neural Networks Using Scaled Cayley Transform
2019
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix. ...
Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well-known difficulty in using RNNs is the vanishing or exploding gradient problem. ...
This research was supported in part by NSF under grants DMS-1821144 and DMS-1620082. We would also like to thank Devin Willmott for his help on this project. ...
doi:10.1609/aaai.v33i01.33014528
fatcat:a3gckfx2qjh2ddgh2cgtufxnuy
Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform
[article]
2019
arXiv
pre-print
In the experiments conducted, the scaled Cayley unitary recurrent neural network (scuRNN) achieves comparable or better results than scoRNN and other unitary RNNs without fixing the scaling matrix. ...
Recurrent neural networks (RNNs) have been successfully used on a wide range of sequential data problems. A well-known difficulty in using RNNs is the vanishing or exploding gradient problem. ...
This research was supported in part by NSF under grants DMS-1821144 and DMS-1620082. We would also like to thank Devin Willmott for his help on this project. ...
arXiv:1811.04142v2
fatcat:3c5td5wwvvchbk6topme2usfta
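Both scuRNN records above parametrize the recurrent matrix through a scaled Cayley transform. Assuming the standard form W = (I + A)^{-1}(I - A)D with A skew-Hermitian and D a diagonal scaling matrix of unit-modulus entries, a minimal numpy sketch looks like this (how A and the diagonal angles are trained is not shown):

```python
import numpy as np

def scaled_cayley_unitary(A, theta):
    """Build W = (I + A)^{-1} (I - A) D from a skew-Hermitian A
    and a diagonal scaling matrix D = diag(exp(i * theta))."""
    n = A.shape[0]
    I = np.eye(n, dtype=complex)
    D = np.diag(np.exp(1j * theta))
    return np.linalg.solve(I + A, (I - A) @ D)

rng = np.random.default_rng(0)
n = 6
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = (M - M.conj().T) / 2                 # skew-Hermitian: A^H = -A
theta = rng.uniform(0, 2 * np.pi, n)     # angles of the scaling matrix

W = scaled_cayley_unitary(A, theta)
print(np.allclose(W.conj().T @ W, np.eye(n)))   # True: W is unitary
```

Because the Cayley transform of a skew-Hermitian matrix is unitary and D is unitary, W is exactly unitary by construction, which is how this family of models keeps gradients from vanishing or exploding through the recurrence.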
Predefined Sparseness in Recurrent Sequence Models
[article]
2018
arXiv
pre-print
Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models. ...
First, in language modeling, we show how to increase hidden state sizes in recurrent layers without increasing the number of parameters, leading to more expressive models. ...
Acknowledgments We thank the anonymous reviewers for their time and effort, and the valuable feedback. ...
arXiv:1808.08720v1
fatcat:epyvwz6wyrewzpqnde7xuena6y
Predefined Sparseness in Recurrent Sequence Models
2018
Proceedings of the 22nd Conference on Computational Natural Language Learning
Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models. ...
First, in language modeling, we show how to increase hidden state sizes in recurrent layers without increasing the number of parameters, leading to more expressive models. ...
Acknowledgments We thank the anonymous reviewers for their time and effort, and the valuable feedback. ...
doi:10.18653/v1/k18-1032
dblp:conf/conll/DemeesterDGD18
fatcat:33t3hsicvze3fdivgvc75xqsgi
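The two records above argue that the hidden state can grow without growing the parameter count when the sparsity pattern is fixed before training. The sketch below illustrates the accounting with a block-diagonal mask over the recurrent matrix; this mask is only one possible predefined pattern, assumed here for illustration, and not necessarily the paper's exact scheme.

```python
import numpy as np

def block_diagonal_mask(hidden, blocks):
    """Binary mask that keeps only 'blocks' dense blocks on the diagonal."""
    mask = np.zeros((hidden, hidden))
    size = hidden // blocks
    for b in range(blocks):
        s = b * size
        mask[s:s + size, s:s + size] = 1.0
    return mask

dense_hidden = 256                       # dense baseline
sparse_hidden, blocks = 512, 4           # larger state, 4 diagonal blocks
mask = block_diagonal_mask(sparse_hidden, blocks)

dense_params = dense_hidden * dense_hidden
sparse_params = int(mask.sum())          # only masked-in weights are trained
print(dense_params, sparse_params)       # 65536 vs 65536: same budget, 2x state
```

At an equal budget of trainable recurrent weights, the masked model carries a hidden state twice as large as the dense baseline.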
Echo State Neural Machine Translation
[article]
2020
arXiv
pre-print
We present neural machine translation (NMT) models inspired by echo state network (ESN), named Echo State NMT (ESNMT), in which the encoder and decoder layer weights are randomly generated then fixed throughout ...
We show that even with this extremely simple model construction and training procedure, ESNMT can already reach 70-80% quality of fully trainable baselines. ...
Background: Echo State Network (Jaeger, 2001) is a special type of recurrent neural network, in which the recurrent matrix (known as the "reservoir") and the input transformation are randomly generated and then fixed ...
arXiv:2002.11847v1
fatcat:6qirjfchgvhmheybbxtj5trqv4
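The echo-state construction referenced in this entry keeps the input map and the recurrent "reservoir" random and fixed, training only a readout on top of the states. A minimal numpy sketch of such a reservoir follows; the uniform/normal initializations and the spectral-radius rescaling are common ESN heuristics assumed here, not details taken from the ESNMT paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_res = 16, 200
W_in = rng.uniform(-0.1, 0.1, size=(d_res, d_in))    # fixed input map
W = rng.normal(size=(d_res, d_res))                  # fixed reservoir
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))      # spectral radius ~0.9

def reservoir_states(xs):
    """Run the fixed, untrained recurrence over an input sequence."""
    h = np.zeros(d_res)
    states = []
    for x in xs:
        h = np.tanh(W_in @ x + W @ h)
        states.append(h)
    return np.stack(states)

states = reservoir_states(rng.normal(size=(10, d_in)))  # length-10 sequence
# Only a readout trained on 'states' would carry learned parameters.
```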
On the impressive performance of randomly weighted encoders in summarization tasks
[article]
2020
arXiv
pre-print
In this work, we investigate the performance of untrained randomly initialized encoders in a general class of sequence to sequence models and compare their performance with that of fully-trained encoders ...
We further find that the capacity of the encoder not only improves overall model generalization but also closes the performance gap between untrained randomly initialized and fully-trained encoders. ...
This work was partially supported by the IVADO Excellence Scholarship and the Canada First Research Excellence Fund. ...
arXiv:2002.09084v1
fatcat:kra4gsrabjhxvcmbqi2mmnbzdi
Orthogonal Recurrent Neural Networks with Scaled Cayley Transform
[article]
2018
arXiv
pre-print
Recent work on Unitary Recurrent Neural Networks (uRNNs) has been used to address this issue and, in some cases, to exceed the capabilities of Long Short-Term Memory networks (LSTMs). ...
In several experiments, the proposed scaled Cayley orthogonal recurrent neural network (scoRNN) achieves superior results with fewer trainable parameters than other unitary RNNs. ...
Acknowledgements This research was supported in part by NSF Grants DMS-1317424 and DMS-1620082. ...
arXiv:1707.09520v3
fatcat:xmrdjr3r45frpdx7dkquowzuye
Echo State Speech Recognition
[article]
2021
arXiv
pre-print
We propose automatic speech recognition (ASR) models inspired by the echo state network (ESN), in which a subset of recurrent neural network (RNN) layers in the models are randomly initialized and untrained ...
Overall, we challenge the common practice of training ASR models for all components, and demonstrate that ESN-based models can perform equally well but enable more efficient training and storage than fully-trainable ...
In this paper, we study this topic for ASR models based on recurrent neural networks (RNN). ...
arXiv:2102.09114v1
fatcat:cpiyjlfxyfc5dfhfmitt5bld74
Neural Networks for Delta Hedging
[article]
2021
arXiv
pre-print
In this paper, we explore the landscape of Deep Neural Network (DNN) based hedging systems by testing the hedging capacity of the following neural architectures: Recurrent Neural Networks, Temporal Convolutional Networks, Attention Networks, and Span Multi-Layer Perceptron Networks. ...
arXiv:2112.10084v1
fatcat:lo2dd3rkgjdmllbcrmtghh3e6a
Perspectives and challenges for recurrent neural network training
2009
Logic Journal of the IGPL
This idea was born during a Dagstuhl seminar entitled 'Recurrent Neural Networks - Models, Capacities, and Applications', which took place in 2008 and which centered around the connection of RNNs to biological ...
Recurrent neural networks (RNNs) offer flexible machine learning tools which share the learning abilities of feedforward networks and which extend their expression abilities based on dynamical equations ...
These contributions cover a wide area of recent developments in the context of recurrent neural network training. The first contribution is connected to the topic of deep learning. ...
doi:10.1093/jigpal/jzp042
fatcat:5lflti5oljamxm6e42dnnoil5q
Bayesian Recurrent Units and the Forward-Backward Algorithm
[article]
2022
arXiv
pre-print
at a very low cost in terms of trainable parameters. ...
The resulting Bayesian recurrent units can be integrated as recurrent neural networks within deep learning frameworks, while retaining a probabilistic interpretation from the direct correspondence with ...
Acknowledgements This project received funding under NAST: Neural Architectures for Speech Technology, Swiss National Science Foundation grant 200021 185010. ...
arXiv:2207.10486v1
fatcat:ksh4yzk3w5f2jl2bynn7absgle
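The entry above relates recurrent units to the forward-backward algorithm. For reference, the classical forward recursion of that algorithm is itself a recurrent update over the sequence, as in the sketch below; this shows only the textbook HMM recursion, not the paper's Bayesian units.

```python
import numpy as np

def forward(pi, A, B, obs):
    """Standard HMM forward recursion.
    pi: (S,) initial state probabilities
    A:  (S, S) transitions, A[i, j] = p(s_t = j | s_{t-1} = i)
    B:  (S, V) emission probabilities
    obs: sequence of observation indices
    Returns alpha[t, j] = p(o_1..o_t, s_t = j)."""
    S = len(pi)
    alpha = np.zeros((len(obs), S))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]   # recurrent update
    return alpha

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
alpha = forward(pi, A, B, [0, 1, 1])
print(alpha[-1].sum())    # likelihood of the observation sequence
```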
Towards Real-World Applications of Online Learning Spiral Recurrent Neural Networks
2009
Journal of Intelligent Learning Systems and Applications
We present a new solution called Spiral Recurrent Neural Networks (SpiralRNN) with online learning based on an extended Kalman filter and gradients as in Real Time Recurrent Learning. ...
In a memory capacity evaluation the number of simultaneously memorized and accurately retrievable trajectories of fixed length was counted. ...
Neural networks, and in particular recurrent neural networks, have proven their suitability at least for offline learning forecast tasks. Examples can be found in [3] or [4]. ...
doi:10.4236/jilsa.2009.11001
fatcat:5foaxtjmhreg3cpmm26hf6pnaq
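The SpiralRNN snippet mentions online learning that combines an extended Kalman filter with RTRL-style gradients. In the generic version of that idea, the weight vector plays the role of the EKF state and the output Jacobian with respect to the weights plays the role of the observation matrix; the sketch below is that textbook EKF step (process noise omitted), not the paper's exact algorithm.

```python
import numpy as np

def ekf_weight_update(w, P, H, y, y_hat, R):
    """One EKF step treating the network weights w as the hidden state.
    H: Jacobian of the output w.r.t. w (e.g. from RTRL), shape (m, n)
    R: observation noise covariance, shape (m, m)"""
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    w = w + K @ (y - y_hat)             # correct weights with the error
    P = P - K @ H @ P                   # shrink weight covariance
    return w, P

# Toy dimensions: n weights, single scalar output (m = 1).
n, m = 5, 1
w = np.zeros(n)
P = np.eye(n)
H = np.random.default_rng(0).normal(size=(m, n))
w, P = ekf_weight_update(w, P, H, np.array([1.0]), np.array([0.2]),
                         0.1 * np.eye(m))
```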
Towards fuzzification of adaptation rules in self-adaptive architectures
[article]
2021
arXiv
pre-print
In this paper, we focus on exploiting neural networks for the analysis and planning stage in self-adaptive architectures. ...
We show how to navigate in this continuum and create a neural network architecture that naturally embeds the original logical rules and how to gradually scale the learning potential of the network, thus ...
...able ones (and additionally having the ability to set the training capacity of the trainable predicates), one can freely ...
A similar prediction is used in [15] to predict values in sensor networks ...
arXiv:2112.09468v1
fatcat:2jvtl3gn6jggxoxpbqcqduz4ke
Showing results 1 — 15 out of 4,580 results