Recurrent Batch Normalization
[article]
2017
arXiv
pre-print
We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. ...
Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition ...
We suspect that the previous difficulties with recurrent batch normalization reported in Laurent et al. (2016); Amodei et al. (2015) are largely due to improper initialization of the batch normalization ...
arXiv:1603.09025v5
fatcat:eyradinuvrfxbpo3cclkyi3vhy
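The abstract above describes batch-normalizing both the input-to-hidden and the hidden-to-hidden transformations of an LSTM, and attributes earlier failures to poor initialization of the batch-normalization parameters. The PyTorch sketch below illustrates one such cell step; the class name BNLSTMCell, the plain nn.BatchNorm1d layers shared across time steps, and the gamma initialization of 0.1 are illustrative assumptions, not the paper's reference implementation (which, for example, keeps separate statistics per time step).

```python
# Sketch of a batch-normalized LSTM cell step (assumptions noted above).
# BN is applied separately to the input-to-hidden and hidden-to-hidden
# pre-activations, and the BN scale (gamma) is initialized small, echoing
# the abstract's remark about improper initialization.
import torch
import torch.nn as nn


class BNLSTMCell(nn.Module):  # hypothetical name
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.W_x = nn.Linear(input_size, 4 * hidden_size, bias=False)
        self.W_h = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        self.bias = nn.Parameter(torch.zeros(4 * hidden_size))
        # Separate BN for the two transformations and for the cell output.
        self.bn_x = nn.BatchNorm1d(4 * hidden_size)
        self.bn_h = nn.BatchNorm1d(4 * hidden_size)
        self.bn_c = nn.BatchNorm1d(hidden_size)
        for bn in (self.bn_x, self.bn_h, self.bn_c):
            nn.init.constant_(bn.weight, 0.1)  # small gamma init (assumed value)
            nn.init.zeros_(bn.bias)

    def forward(self, x_t, state):
        h_prev, c_prev = state
        gates = self.bn_x(self.W_x(x_t)) + self.bn_h(self.W_h(h_prev)) + self.bias
        i, f, g, o = gates.chunk(4, dim=1)
        c_t = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h_t = torch.sigmoid(o) * torch.tanh(self.bn_c(c_t))
        return h_t, c_t


# Toy usage: unroll the cell over a random (time, batch, features) sequence.
cell = BNLSTMCell(10, 20)
h = c = torch.zeros(8, 20)
for x_t in torch.randn(5, 8, 10):
    h, c = cell(x_t, (h, c))
```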
Batch Normalized Recurrent Neural Networks
[article]
2015
arXiv
pre-print
In particular, batch normalization, which uses mini-batch statistics to standardize features, was shown to significantly reduce training time. ...
Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies. ...
Those results suggest that this way of applying batch normalization in recurrent networks is not optimal. It seems that batch normalization hurts the training procedure. ...
arXiv:1510.01378v1
fatcat:lkzvghltmzhitedahionexsz2u
Batch-normalized recurrent highway networks
2017
2017 IEEE International Conference on Image Processing (ICIP)
In this work, batch normalized recurrent highway networks are proposed to control the gradient flow in an improved way for network convergence. ...
Experimental results indicate that the batch normalized recurrent highway networks converge faster and perform better compared with the traditional LSTM and RHN based models. ...
Fig. 1. The architecture of batch normalized recurrent neural networks. ...
doi:10.1109/icip.2017.8296359
dblp:conf/icip/ZhangNSPLS17
fatcat:dgrdgqdqqzajrfod7heefr45dq
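As a rough companion sketch, the step below shows one plausible way to insert batch normalization into a recurrent highway recurrence like the one this entry describes. The coupled carry gate (c = 1 − t), the single recurrence depth, and the placement of BN on the input and state pre-activations are assumptions for illustration; the ICIP paper's exact architecture and BN placement may differ.

```python
# Illustrative only: one depth-1 recurrent highway step with batch
# normalization on the pre-activations; not the paper's exact model.
import torch
import torch.nn as nn


class BNRHNStep(nn.Module):  # hypothetical name
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.W = nn.Linear(input_size, 2 * hidden_size, bias=False)
        self.R = nn.Linear(hidden_size, 2 * hidden_size, bias=True)
        self.bn_x = nn.BatchNorm1d(2 * hidden_size)
        self.bn_s = nn.BatchNorm1d(2 * hidden_size)

    def forward(self, x_t, s_prev):
        pre = self.bn_x(self.W(x_t)) + self.bn_s(self.R(s_prev))
        h_pre, t_pre = pre.chunk(2, dim=1)
        h = torch.tanh(h_pre)          # candidate update
        t = torch.sigmoid(t_pre)       # transform gate
        return h * t + s_prev * (1.0 - t)  # coupled carry gate c = 1 - t
```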
Improving Gated Recurrent Unit Based Acoustic Modeling with Batch Normalization and Enlarged Context
[article]
2018
arXiv
pre-print
Recently, we proposed an RNN model called minimal gated recurrent unit with input projection (mGRUIP), in which a context module, namely temporal convolution, is specifically designed to model the future ...
In the literature, batch normalization has been applied to RNN in different ways. ...
This prompts us to use batch normalization to improve the convergence of optimization process. ...
arXiv:1811.10169v1
fatcat:eodop4tmkzhu7mzyvrkiejinru
Implementation of a batch normalized deep LSTM recurrent network on a smartphone for human activity recognition
2019
2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)
In this paper we present a Long-Short Term Memory (LSTM) deep recurrent neural network (RNN) model for the classification of human daily life activities by using the accelerometer and gyroscope data of ...
To make the network fast and robust we have employed dropout regularization and the recently introduced batch normalization method [13]. ...
We have utilized the sequential model with dense, conv1D, maxpooling1D, dropout, and batch normalization layers.
B. Model deployment, compilation and training
doi:10.1109/bhi.2019.8834480
dblp:conf/bhi/ZebinBOCP19
fatcat:yeaiu3pdg5guzier2tn53odhvu
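The layer list quoted in this entry (a sequential model with dense, conv1D, maxpooling1D, dropout and batch normalization layers) maps directly onto a Keras model. The sketch below is a hypothetical reconstruction: the 128-sample window, six inertial channels, layer widths and six output classes are placeholder assumptions, not the configuration reported in the paper.

```python
# Hypothetical Keras reconstruction of a batch-normalized LSTM pipeline
# of the kind described above; all hyperparameters are placeholders.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(128, 6)),          # 128-sample window, 6 inertial channels (assumed)
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.BatchNormalization(),
    layers.LSTM(64, return_sequences=True),
    layers.Dropout(0.5),
    layers.BatchNormalization(),
    layers.LSTM(32),
    layers.Dropout(0.5),
    layers.Dense(6, activation="softmax"),  # 6 daily-life activities (assumed)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```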
Human activity recognition from inertial sensor time-series using batch normalized deep LSTM recurrent networks
2018
2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
Further, we show that this accuracy can be achieved with almost four times fewer training epochs by using a batch normalization approach. ...
In this paper we present a Long-Short Term Memory (LSTM) deep recurrent neural network for the classification of six daily life activities from accelerometer and gyroscope data. ...
Batch normalization has recently been introduced to overcome this by normalizing the x_t and h_{t−1} activations going into each layer, reducing internal covariate shift. ...
doi:10.1109/embc.2018.8513115
pmid:30440301
fatcat:vlu6dej73jhznh27xbkocbxfey
Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning
[article]
2016
arXiv
pre-print
We systematically explored a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online ...
Unlike previous approaches, our technique can be applied out of the box to all learning scenarios (e.g., online learning, batch learning, fully-connected, convolutional, feedforward, recurrent and mixed ...
Generalization to Recurrent Learning: In this section, we generalize Sample Normalization, General Batch Normalization and Streaming Normalization to recurrent learning. ...
arXiv:1610.06160v1
fatcat:eg6vri5hfvgknb6buyqamoyvta
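For intuition about the online-learning limitation this abstract mentions, the snippet below normalizes with exponentially decayed running statistics updated one mini-batch (or one sample) at a time. It is not the paper's Streaming Normalization algorithm, whose accumulators and gradient treatment are more involved; the RunningNormalizer class and its momentum value are assumptions for illustration only.

```python
# Generic running-statistics normalizer (illustration, not the paper's method).
import numpy as np


class RunningNormalizer:  # hypothetical helper
    def __init__(self, num_features: int, momentum: float = 0.99, eps: float = 1e-5):
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x: np.ndarray) -> np.ndarray:
        # x: (batch, features); batch may be 1 in the online setting.
        self.mean = self.momentum * self.mean + (1 - self.momentum) * x.mean(axis=0)
        self.var = self.momentum * self.var + (1 - self.momentum) * x.var(axis=0)
        return (x - self.mean) / np.sqrt(self.var + self.eps)
```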
Investigation on the Combination of Batch Normalization and Dropout in BLSTM-based Acoustic Modeling for ASR
2018
Interspeech 2018
In this paper, we explored some novel approaches to add batch normalization to the LSTM model in bidirectional mode. ...
However, applying batch normalization in the LSTM model is more complicated and challenging than in the feed-forward network. ...
People hold different views on how to use batch normalization on recurrent neural networks [9][10]. ...
doi:10.21437/interspeech.2018-1597
dblp:conf/interspeech/LiCGZ018
fatcat:es7g237ho5cujihrcbqegao3fa
Layer Normalization
[article]
2016
arXiv
pre-print
However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. ...
Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. ...
[Cooijmans et al., 2016] suggests the best performance of recurrent batch normalization is obtained by keeping independent normalization statistics for each time-step. ...
arXiv:1607.06450v1
fatcat:w2kufqz6mfhdjb4okiyrpxyexe
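The contrast drawn in this abstract, computing statistics over the features of each sample rather than over the mini-batch for each feature, can be stated in a few lines of NumPy (scale/shift parameters and running statistics omitted):

```python
# Batch normalization vs. layer normalization, statistics only.
import numpy as np

x = np.random.randn(32, 100)  # (batch, features)
eps = 1e-5

# Batch norm: mean/variance per feature, computed across the mini-batch,
# so the result depends on the batch size and composition.
bn = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# Layer norm: mean/variance per sample, computed across the features,
# so it is independent of the mini-batch and applies directly to RNN steps.
ln = (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)
```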
The microbiome in lung cancer tissue and recurrence-free survival
2019
Cancer Epidemiology, Biomarkers and Prevention
DNA extraction and amplification steps occurred in two batches (batch 1: 10 samples and batch 2: 28 samples; tumor-normal pairs from same patient kept together), but all samples were sequenced in the same batch. ...
doi:10.1158/1055-9965.epi-18-0966
pmid:30733306
pmcid:PMC6449216
fatcat:lruu2gzz3vdefhrx6e7adwg3pi
Learning Representations that Support Extrapolation
[article]
2020
arXiv
pre-print
We also introduce a simple technique, temporal context normalization, that encourages representations that emphasize the relations between objects. ...
[Results-table fragment: rows labelled "(RECURRENT)" and "BATCH NORM." with values 100.0 ± 0.0, 44.1 ± 5.0, 28.1 ± 2.4, 23.4 ± 1.6, 20.2 ± 1.2, 18.3 ± 0.8] ...
We performed TCN before passing the embeddings to the recurrent network, and then de-normalized the predictions made by the recurrent network. ...
arXiv:2007.05059v2
fatcat:h6mxldfbxbhbxmppddob6oox7y
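The last snippet describes normalizing embeddings over the temporal context before the recurrent network and de-normalizing its predictions afterwards. The sketch below illustrates that normalize, predict, de-normalize pattern under the assumption that temporal context normalization z-scores each embedding dimension over the sequence axis; the paper's exact statistics and interfaces may differ, and `temporal_context_normalize`, `predict_with_tcn` and the stand-in `model` are hypothetical names.

```python
# Illustrative normalize / de-normalize wrapper around a sequence model.
# `model` is any callable mapping (batch, time, dim) -> (batch, dim).
import numpy as np

EPS = 1e-8

def temporal_context_normalize(z):
    # z: (batch, time, dim); statistics taken over the temporal context (axis=1).
    mu = z.mean(axis=1, keepdims=True)
    sigma = z.std(axis=1, keepdims=True) + EPS
    return (z - mu) / sigma, mu, sigma

def predict_with_tcn(model, z):
    z_norm, mu, sigma = temporal_context_normalize(z)
    y_norm = model(z_norm)                  # prediction in normalized space
    return y_norm * sigma[:, 0] + mu[:, 0]  # de-normalize the prediction

# Dummy stand-in "model" that averages the normalized sequence (demo only).
y = predict_with_tcn(lambda z: z.mean(axis=1), np.random.randn(4, 7, 16))
```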
AdaFilter: Adaptive Filter Fine-Tuning for Deep Transfer Learning
2020
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020)
We use a recurrent gated network to selectively fine-tune convolutional filters based on the activations of the previous layer. ...
We compare gated batch normalization (Gated BN) against the standard batch normalization (Standard BN). ...
In standard batch normalization, we use one batch normalization layer to normalize each channel across a mini-batch. Table 3 shows the results of the Gated BN and the standard BN on all the datasets. ...
doi:10.1609/aaai.v34i04.5824
fatcat:pcv2onzk4rdljlrs3k62b6bxqm
AdaFilter: Adaptive Filter Fine-tuning for Deep Transfer Learning
[article]
2019
arXiv
pre-print
We use a recurrent gated network to selectively fine-tune convolutional filters based on the activations of the previous layer. ...
We compare gated batch normalization (Gated BN) against the standard batch normalization (Standard BN). ...
In standard batch normalization, we use one batch normalization layer to normalize each channel across a mini-batch. Table 3 shows the results of the Gated BN and the standard BN on all the datasets. ...
arXiv:1911.09659v2
fatcat:54e6rypzpzd7jckz5w7baqonfe
Applying the Transformer to Character-level Transduction
[article]
2021
arXiv
pre-print
We show that with a large enough batch size, the transformer does indeed outperform recurrent models. ...
Yet for character-level transduction tasks, e.g. morphological inflection generation and historical text normalization, there are few works that outperform recurrent models using the transformer. ...
Historical Text Normalization. ...
arXiv:2005.10213v2
fatcat:xghukun42nho5owxwrmrhot6cm
Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network
[article]
2020
arXiv
pre-print
This is generally processed by deep recurrent neural networks and more specifically with the use of Long Short-Term Memory cells. ...
In this paper we present a Gated Fully Convolutional Network architecture that is a recurrence-free alternative to the well-known CNN+LSTM architectures. ...
Batch Normalization is the one with the worst CER. This can be explained by the small mini-batches used, which make the Batch Normalization statistics slightly less stable. ...
arXiv:2012.04961v1
fatcat:z7mx2kkavjhi5j23juxk4uqrnu
Showing results 1 — 15 out of 69,945 results