42,122 Hits in 2.9 sec

Batch Normalized Recurrent Neural Networks [article]

César Laurent, Gabriel Pereyra, Philémon Brakel, Ying Zhang and Yoshua Bengio
2015 arXiv   pre-print
Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies. ... Recent work has shown that normalizing intermediate representations of neural networks can significantly improve convergence rates in feedforward neural networks. ... Recurrent Neural Networks (RNNs) extend Neural Networks to sequential data. ...
arXiv:1510.01378v1 fatcat:lkzvghltmzhitedahionexsz2u
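
A minimal NumPy sketch of the general idea in the entry above: mini-batch normalization applied to the input-to-hidden term of a vanilla RNN step. Function names, dimensions, and the exact placement of the normalization are illustrative assumptions, not the paper's formulation.

import numpy as np

def batch_norm(z, gamma, beta, eps=1e-5):
    # Normalize each feature over the mini-batch dimension (axis 0).
    mean = z.mean(axis=0, keepdims=True)
    var = z.var(axis=0, keepdims=True)
    return gamma * (z - mean) / np.sqrt(var + eps) + beta

def rnn_step_bn(x_t, h_prev, W_x, W_h, b, gamma, beta):
    # Batch-normalize only the input-to-hidden term; the recurrent term is left
    # untouched, mirroring the "input-to-hidden only" variants in this line of work.
    z = batch_norm(x_t @ W_x, gamma, beta) + h_prev @ W_h + b
    return np.tanh(z)

# Toy dimensions: batch of 4, 3 input features, 5 hidden units.
rng = np.random.default_rng(0)
x_t = rng.normal(size=(4, 3))
h_prev = np.zeros((4, 5))
W_x, W_h = rng.normal(size=(3, 5)), rng.normal(size=(5, 5))
b, gamma, beta = np.zeros(5), np.ones(5), np.zeros(5)
print(rnn_step_bn(x_t, h_prev, W_x, W_h, b, gamma, beta).shape)  # (4, 5)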

Investigation on the Combination of Batch Normalization and Dropout in BLSTM-based Acoustic Modeling for ASR

Li Wenjie, Gaofeng Cheng, Fengpei Ge, Pengyuan Zhang, Yonghong Yan
2018 Interspeech 2018  
Batch normalization (BN) is a good way to accelerate network training and improve the generalization performance of neural networks. ... The Long Short-Term Memory (LSTM) architecture is a very special kind of recurrent neural network for modeling sequential data like speech. ... People hold different views on how to use batch normalization on the recurrent neural network [9] [10]. ...
doi:10.21437/interspeech.2018-1597 dblp:conf/interspeech/LiCGZ018 fatcat:es7g237ho5cujihrcbqegao3fa

Layer Normalization [article]

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
2016 arXiv   pre-print
However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. ... It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. ... Layer normalized recurrent neural networks: The recent sequence-to-sequence models [Sutskever et al., 2014] utilize compact recurrent neural networks to solve sequential prediction problems in natural ...
arXiv:1607.06450v1 fatcat:w2kufqz6mfhdjb4okiyrpxyexe
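
To illustrate how layer normalization avoids batch statistics, the sketch below (an assumed NumPy rendering, not the authors' code) computes the mean and standard deviation over the feature dimension of each sample, so the same transform can be applied at every time step of an RNN.

import numpy as np

def layer_norm(a, gain, bias, eps=1e-5):
    # Statistics are computed per sample over the feature axis, so the same
    # operation works at every time step without mini-batch statistics.
    mean = a.mean(axis=-1, keepdims=True)
    std = a.std(axis=-1, keepdims=True)
    return gain * (a - mean) / (std + eps) + bias

def rnn_step_ln(x_t, h_prev, W_x, W_h, b, gain, bias):
    z = x_t @ W_x + h_prev @ W_h + b
    return np.tanh(layer_norm(z, gain, bias))

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 7, 3))          # (batch, time, features)
h = np.zeros((2, 5))
W_x, W_h = rng.normal(size=(3, 5)), rng.normal(size=(5, 5))
b, gain, bias = np.zeros(5), np.ones(5), np.zeros(5)
for t in range(x.shape[1]):             # identical normalization at each time step
    h = rnn_step_ln(x[:, t], h, W_x, W_h, b, gain, bias)
print(h.shape)  # (2, 5)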

Recurrent Batch Normalization [article]

Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville
2017 arXiv   pre-print
We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks.  ...  Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition  ...  Liao & Poggio (2016) simultaneously investigated batch normalization in recurrent neural networks, albeit only for very short sequences (10 steps).  ... 
arXiv:1603.09025v5 fatcat:eyradinuvrfxbpo3cclkyi3vhy
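
A rough sketch of the hidden-to-hidden normalization idea described in the entry above, with the input-to-hidden and hidden-to-hidden terms normalized separately before being combined. The LSTM gating and per-time-step statistics of the paper are omitted, and the names and small initial scale are assumptions.

import numpy as np

def batch_norm(z, gamma, eps=1e-5):
    # Normalize each unit over the mini-batch; the additive bias is kept in the cell below.
    mean = z.mean(axis=0, keepdims=True)
    var = z.var(axis=0, keepdims=True)
    return gamma * (z - mean) / np.sqrt(var + eps)

def rnn_step_recurrent_bn(x_t, h_prev, W_x, W_h, b, gamma_x, gamma_h):
    # Normalize the input-to-hidden and hidden-to-hidden terms separately,
    # then add a shared bias, in the spirit of recurrent batch normalization.
    z = batch_norm(x_t @ W_x, gamma_x) + batch_norm(h_prev @ W_h, gamma_h) + b
    return np.tanh(z)

rng = np.random.default_rng(2)
x_t, h_prev = rng.normal(size=(8, 4)), rng.normal(size=(8, 6))
W_x, W_h = rng.normal(size=(4, 6)), rng.normal(size=(6, 6))
b, gx, gh = np.zeros(6), 0.1 * np.ones(6), 0.1 * np.ones(6)
print(rnn_step_recurrent_bn(x_t, h_prev, W_x, W_h, b, gx, gh).shape)  # (8, 6)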

Batch-normalized recurrent highway networks

Chi Zhang, Thang Nguyen, Shagan Sah, Raymond Ptucha, Alexander Loui, Carl Salvaggio
2017 2017 IEEE International Conference on Image Processing (ICIP)  
In this work, batch normalized recurrent highway networks are proposed to control the gradient flow in an improved way for network convergence. ... Experimental results indicate that the batch normalized recurrent highway networks converge faster and perform better compared with the traditional LSTM- and RHN-based models. ... Fig. 1. The architecture of batch normalized recurrent neural networks. ...
doi:10.1109/icip.2017.8296359 dblp:conf/icip/ZhangNSPLS17 fatcat:dgrdgqdqqzajrfod7heefr45dq
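
A hedged sketch of one recurrent highway step with a coupled carry gate and batch normalization on the candidate transform. This is a generic reading of "batch normalized recurrent highway networks", not the exact architecture of the paper; all parameter names are assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_norm(z, gamma, beta, eps=1e-5):
    mean, var = z.mean(axis=0, keepdims=True), z.var(axis=0, keepdims=True)
    return gamma * (z - mean) / np.sqrt(var + eps) + beta

def rhn_step_bn(x_t, s_prev, params):
    # One recurrent-highway step with a coupled carry gate (c = 1 - t); batch
    # normalization is applied to the candidate transform to stabilize gradient flow.
    W_h, R_h, W_t, R_t, gamma, beta = params
    h = np.tanh(batch_norm(x_t @ W_h + s_prev @ R_h, gamma, beta))
    t = sigmoid(x_t @ W_t + s_prev @ R_t)
    return h * t + s_prev * (1.0 - t)

rng = np.random.default_rng(3)
n_in, n_hid = 4, 6
params = (rng.normal(size=(n_in, n_hid)), rng.normal(size=(n_hid, n_hid)),
          rng.normal(size=(n_in, n_hid)), rng.normal(size=(n_hid, n_hid)),
          np.ones(n_hid), np.zeros(n_hid))
x_t, s_prev = rng.normal(size=(8, n_in)), np.zeros((8, n_hid))
print(rhn_step_bn(x_t, s_prev, params).shape)  # (8, 6)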

Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers

Baihan Lin
2021 Entropy  
Lastly, the unsupervised attention mechanisms are a useful probing tool for neural networks, tracking the dependency and critical learning stages across layers and recurrent time steps of deep networks ... Treating the neural network optimization process as a partially observable model selection problem, the regularity normalization constrains the implicit space by a normalization factor, the universal code ... Normalization Methods in Neural Networks: The batch normalization (BN) performs a global normalization along the batch dimension such that, for each neuron in a layer, the activation over all the mini-batch ...
doi:10.3390/e24010059 pmid:35052085 pmcid:PMC8774926 fatcat:2m6wl4xsrbeixiysjgwgkld2xu
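
For reference, the per-neuron mini-batch normalization described in the snippet above can be written as a small class that also tracks running statistics for inference. This is a standard textbook-style sketch, not code from the paper.

import numpy as np

class BatchNorm1d:
    """Per-neuron normalization over the mini-batch, with running statistics for inference."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.gamma, self.beta = np.ones(num_features), np.zeros(num_features)
        self.running_mean, self.running_var = np.zeros(num_features), np.ones(num_features)
        self.momentum, self.eps = momentum, eps

    def __call__(self, x, training=True):
        if training:
            mean, var = x.mean(axis=0), x.var(axis=0)
            # Track statistics for use at inference time.
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            mean, var = self.running_mean, self.running_var
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta

bn = BatchNorm1d(5)
x = np.random.default_rng(4).normal(size=(16, 5))   # mini-batch of 16 activations
print(bn(x, training=True).mean(axis=0).round(6))   # approximately zero per neuron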

AdaFilter: Adaptive Filter Fine-Tuning for Deep Transfer Learning

Yunhui Guo, Yandong Li, Liqiang Wang, Tajana Rosing
2020 Proceedings of the AAAI Conference on Artificial Intelligence
We use a recurrent gated network to selectively fine-tune convolutional filters based on the activations of the previous layer. ... There is an increasing number of pre-trained deep neural network models. However, it is still unclear how to effectively use these models for a new task. ... Gated Batch Normalization: The batch normalization (BN) layer (Ioffe and Szegedy 2015) is designed to alleviate the internal covariate shift issue when training deep neural networks. ...
doi:10.1609/aaai.v34i04.5824 fatcat:pcv2onzk4rdljlrs3k62b6bxqm
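
A simplified sketch of per-filter gating for fine-tuning: a gate computed from the previous layer's activations mixes frozen pre-trained filter outputs with their fine-tuned copies. The paper uses a recurrent gated network shared across layers; the single linear gate below is an assumption made for brevity.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_layer_output(prev_activations, frozen_out, finetuned_out, W_gate, b_gate):
    # A per-filter gate, computed from pooled activations of the previous layer,
    # mixes the frozen pre-trained filters with their fine-tuned copies.
    pooled = prev_activations.mean(axis=(2, 3))             # (batch, channels)
    gate = sigmoid(pooled @ W_gate + b_gate)                # (batch, num_filters)
    gate = gate[:, :, None, None]                           # broadcast over H, W
    return gate * finetuned_out + (1.0 - gate) * frozen_out

rng = np.random.default_rng(5)
prev = rng.normal(size=(2, 3, 8, 8))        # previous layer activations
frozen = rng.normal(size=(2, 4, 8, 8))      # output of frozen pre-trained filters
tuned = rng.normal(size=(2, 4, 8, 8))       # output of fine-tuned filters
W_gate, b_gate = rng.normal(size=(3, 4)), np.zeros(4)
print(gated_layer_output(prev, frozen, tuned, W_gate, b_gate).shape)  # (2, 4, 8, 8)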

AdaFilter: Adaptive Filter Fine-tuning for Deep Transfer Learning [article]

Yunhui Guo, Yandong Li, Liqiang Wang, Tajana Rosing
2019 arXiv   pre-print
We use a recurrent gated network to selectively fine-tune convolutional filters based on the activations of the previous layer. ... There is an increasing number of pre-trained deep neural network models. However, it is still unclear how to effectively use these models for a new task. ... Gated Batch Normalization: The batch normalization (BN) layer (Ioffe and Szegedy 2015) is designed to alleviate the internal covariate shift issue when training deep neural networks. ...
arXiv:1911.09659v2 fatcat:54e6rypzpzd7jckz5w7baqonfe

A constrained recursion algorithm for batch normalization of tree-structured LSTM [article]

Ruo Ando, Yoshiyasu Takefuji
2020 arXiv   pre-print
The proposed method enables us to explore the optimized selection of hyperparameters of a recursive neural network implementation by changing the constraints of our recursion algorithm. ... To name a few, hyperparameters such as the interval of state initialization and the number of batches for normalization have been left unexplored, specifically in applying batch normalization for reducing training ... Deep recurrent neural network: For coping with sequential data which have potentially long-term dependencies, Recurrent Neural Networks (RNNs) are powerful models. ...
arXiv:2008.09409v1 fatcat:qjuf2luxpjfz3k4qrsd7gbtie4

A Novel Architecture for Predicting Pneumonia Patients by using LSTM, GRU and CNN

2019 International Journal of Engineering and Advanced Technology  
The models based only on Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU) and Dynamic Recurrent Neural Network (Dynamic RNN) are not sufficient for prediction of pneumonia patients using image ... The proposed network uses properties of LSTM, GRU and Convolutional Neural Networks, such as the capacity to remember long-term memory and to handle the input parameters dynamically. ... Thus a new network named RNN, which can also be called a Recurrent Neural Network. ...
doi:10.35940/ijeat.a1353.109119 fatcat:mrgbxaxa6nectkgq4fxaw4zq3i

Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers [article]

Baihan Lin
2021 arXiv   pre-print
Lastly, the unsupervised attention mechanisms are a useful probing tool for neural networks, tracking the dependency and critical learning stages across layers and recurrent time steps of deep networks ... Treating the neural network optimization process as a partially observable model selection problem, the regularity normalization constrains the implicit space by a normalization factor, the universal code ... Normalization methods in neural networks. ...
arXiv:1902.10658v13 fatcat:bb7helcugrehddai22n3zwtfgu

A comprehensive study of batch construction strategies for recurrent neural networks in MXNet [article]

Patrick Doetsch, Pavel Golik, Hermann Ney
2017 arXiv   pre-print
In this work we compare different batch construction methods for mini-batch training of recurrent neural networks.  ...  While popular implementations like TensorFlow and MXNet suggest a bucketing approach to improve the parallelization capabilities of the recurrent training process, we propose a simple ordering strategy  ...  Modern acoustic models therefore use recurrent neural networks (RNN) to model long temporal dependencies.  ... 
arXiv:1705.02414v1 fatcat:pv7gww5smzfu5gfqmguszhrwuq
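
One common batch construction strategy for variable-length sequences is to sort by length and pad within each mini-batch, so little padding is wasted. The sketch below illustrates that general idea only; it is not the specific bucketing or ordering strategy evaluated in the paper above.

import numpy as np

def make_sorted_batches(sequences, batch_size):
    # Sort sequences by length, cut consecutive runs into mini-batches, and pad
    # each batch only up to the longest sequence it contains.
    order = sorted(range(len(sequences)), key=lambda i: len(sequences[i]))
    batches = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        max_len = max(len(sequences[i]) for i in idx)
        padded = np.zeros((len(idx), max_len))
        for row, i in enumerate(idx):
            padded[row, :len(sequences[i])] = sequences[i]
        batches.append((idx, padded))
    return batches

rng = np.random.default_rng(6)
seqs = [rng.normal(size=rng.integers(5, 50)) for _ in range(10)]
for idx, batch in make_sorted_batches(seqs, batch_size=4):
    print(idx, batch.shape)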

Advanced Convolutional Neural Network-Based Hybrid Acoustic Models for Low-Resource Speech Recognition

Tessfu Geteye Fantaye, Junqing Yu, Tulu Tilahun Hailu
2020 Computers  
Recently, recurrent neural networks (RNNs) have shown great abilities for modeling long-term context dependencies. ... The proposed neural network models achieve 0.1–42.79% relative performance improvements over their corresponding feed-forward DNN, CNN, bidirectional RNN (BRNN), or bidirectional gated recurrent unit ( ... Figure 3. Gated recurrent convolutional network (GRCL). Figure 4. Highway recurrent convolutional neural network (HRCL). ...
doi:10.3390/computers9020036 fatcat:k54s5pj7grggffsbibjpm5jk2q

Recurrent neural network-based fault detector for aileron failures of aircraft

Nobuyuki Yoshikawa, Nacim Belkhir, Sinji Suzuki
2017 2017 11th Asian Control Conference (ASCC)  
This paper empirically investigates the design of a fault detection mechanism based on a Long Short-Term Memory (LSTM) neural network. ... Given an equation-based model that approximates the behavior of aircraft ailerons, the fault detector aims at predicting the state of the aircraft: the normal state, for which no failures are observed, or four ... Description: The concept of the recurrent neural network can be traced back to [14]. Fig. 1 shows a typical recurrent neural network with an unfolded structure. ...
doi:10.1109/ascc.2017.8287391 dblp:conf/ascc/YoshikawaBS17 fatcat:2ecgjaqtxjghzkniaxbm35mltm

Attentive batch normalization for lstm-based acoustic modeling of speech recognition [article]

Fenglin Ding, Wu Guo, Lirong Dai, Jun Du
2020 arXiv   pre-print
Batch normalization (BN) is an effective method to accelerate model training and improve the generalization performance of neural networks. ... In the proposed method, an auxiliary network is used to dynamically generate the scaling and shifting parameters in batch normalization, and attention mechanisms are introduced to improve their regularized ... However, it is more difficult to apply BN in a recurrent architecture. Researchers have investigated many algorithms to apply BN on the recurrent neural network [11, 12, 13, 14]. ...
arXiv:2001.00129v1 fatcat:rco7uvhqwnapfmvazlvqda3jpa
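
A toy sketch of the core idea of generating the batch normalization scale and shift from an auxiliary network rather than learning them as fixed parameters. The linear "auxiliary network" and the mini-batch summary used here are stand-ins for the paper's attention-based mechanism, not its actual design.

import numpy as np

def attentive_batch_norm(x, W_gamma, W_beta, b_gamma, b_beta, eps=1e-5):
    # Standard mini-batch standardization per feature ...
    mean, var = x.mean(axis=0, keepdims=True), x.var(axis=0, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # ... but the scale and shift are produced by an auxiliary network from an
    # input-dependent summary instead of being fixed learned vectors.
    summary = x.mean(axis=0)                  # crude summary of the mini-batch
    gamma = summary @ W_gamma + b_gamma       # dynamically generated scale
    beta = summary @ W_beta + b_beta          # dynamically generated shift
    return gamma * x_hat + beta

rng = np.random.default_rng(7)
x = rng.normal(size=(16, 6))
W_gamma, W_beta = 0.01 * rng.normal(size=(6, 6)), 0.01 * rng.normal(size=(6, 6))
print(attentive_batch_norm(x, W_gamma, W_beta, np.ones(6), np.zeros(6)).shape)  # (16, 6)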
Showing results 1 — 15 out of 42,122 results