
Recurrent Batch Normalization [article]

Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville
2017 arXiv   pre-print
We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks.  ...  Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition  ...  We suspect that the previous difficulties with recurrent batch normalization reported in Laurent et al. (2016); Amodei et al. (2015) are largely due to improper initialization of the batch normalization  ...
arXiv:1603.09025v5 fatcat:eyradinuvrfxbpo3cclkyi3vhy
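
The hidden-to-hidden batch normalization and the initialization issue mentioned in this abstract can be made concrete with a short sketch. The NumPy code below is a simplified, illustrative LSTM step under assumed shapes and an assumed gamma value of 0.1; it is not the authors' implementation (which, among other things, keeps separate statistics per time-step).

    import numpy as np

    def bn(x, gamma, beta, eps=1e-5):
        # Standardize each feature over the mini-batch (axis 0), then rescale.
        mean = x.mean(axis=0, keepdims=True)
        var = x.var(axis=0, keepdims=True)
        return gamma * (x - mean) / np.sqrt(var + eps) + beta

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def bn_lstm_step(x_t, h_prev, c_prev, Wx, Wh, b, gammas, betas):
        # Batch-normalize the input-to-hidden AND the hidden-to-hidden
        # pre-activations separately, instead of only the input-to-hidden term.
        pre = bn(x_t @ Wx, gammas[0], betas[0]) + bn(h_prev @ Wh, gammas[1], betas[1]) + b
        i, f, o, g = np.split(pre, 4, axis=1)
        c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
        # A third normalization is applied to the cell state before the output.
        h_t = sigmoid(o) * np.tanh(bn(c_t, gammas[2], betas[2]))
        return h_t, c_t

    # Toy dimensions: batch of 8, 16 input features, 32 hidden units (assumed).
    B, D, H = 8, 16, 32
    rng = np.random.default_rng(0)
    Wx, Wh, b = rng.normal(size=(D, 4 * H)), rng.normal(size=(H, 4 * H)), np.zeros(4 * H)
    # A small gamma (e.g. 0.1) keeps the tanh/sigmoid pre-activations out of
    # saturation at initialization -- the "improper initialization" issue above.
    gammas = [np.full(4 * H, 0.1), np.full(4 * H, 0.1), np.full(H, 0.1)]
    betas = [np.zeros(4 * H), np.zeros(4 * H), np.zeros(H)]
    h, c = bn_lstm_step(rng.normal(size=(B, D)), np.zeros((B, H)), np.zeros((B, H)), Wx, Wh, b, gammas, betas)
    print(h.shape, c.shape)  # (8, 32) (8, 32)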

Batch Normalized Recurrent Neural Networks [article]

César Laurent, Gabriel Pereyra, Philémon Brakel, Ying Zhang, Yoshua Bengio
2015 arXiv   pre-print
In particular, batch normalization, which uses mini-batch statistics to standardize features, was shown to significantly reduce training time.  ...  Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies.  ...  Those results suggest that this way of applying batch normalization in the recurrent networks is not optimal. It seems that batch normalization hurts the training procedure.  ... 
arXiv:1510.01378v1 fatcat:lkzvghltmzhitedahionexsz2u
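
For contrast with the previous entry, a minimal sketch of the variant this abstract describes: batch normalization computed from mini-batch statistics and applied only to the input-to-hidden transformation of a recurrent step. The plain tanh cell and all shapes are assumptions for illustration, not the authors' exact model.

    import numpy as np

    def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
        # Standardize each feature with mini-batch statistics, then rescale.
        return gamma * (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps) + beta

    def rnn_step_input_bn(x_t, h_prev, Wx, Wh, b):
        # BN on the input-to-hidden term only; the recurrent term is untouched,
        # which is the placement this abstract reports as suboptimal.
        return np.tanh(batch_norm(x_t @ Wx) + h_prev @ Wh + b)

    rng = np.random.default_rng(1)
    x_t, h_prev = rng.normal(size=(8, 16)), np.zeros((8, 32))
    Wx, Wh, b = rng.normal(size=(16, 32)), rng.normal(size=(32, 32)), np.zeros(32)
    print(rnn_step_input_bn(x_t, h_prev, Wx, Wh, b).shape)  # (8, 32)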

Batch-normalized recurrent highway networks

Chi Zhang, Thang Nguyen, Shagan Sah, Raymond Ptucha, Alexander Loui, Carl Salvaggio
2017 2017 IEEE International Conference on Image Processing (ICIP)  
In this work, batch normalized recurrent highway networks are proposed to control the gradient flow in an improved way for network convergence.  ...  Experimental results indicate that the batch normalized recurrent highway networks converge faster and perform better compared with the traditional LSTM and RHN based models.  ...  Fig. 1. The architecture of batch normalized recurrent neural networks.  ...
doi:10.1109/icip.2017.8296359 dblp:conf/icip/ZhangNSPLS17 fatcat:dgrdgqdqqzajrfod7heefr45dq
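
A rough sketch of a batch-normalized highway-style recurrent update, to make the "controlled gradient flow" idea concrete. The coupled carry gate (1 - transform) and the placement of batch normalization on the pre-activations are assumptions for illustration, not necessarily the configuration used in the paper.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def batch_norm(x, gamma, beta, eps=1e-5):
        return gamma * (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps) + beta

    def bn_rhn_step(x_t, s_prev, Wx, Ws, b, gamma, beta):
        # One recurrence depth of a highway-style update: a transform gate t
        # mixes the candidate h with the carried previous state s_prev.
        pre = batch_norm(x_t @ Wx + s_prev @ Ws, gamma, beta) + b
        h, t = np.split(pre, 2, axis=1)
        h, t = np.tanh(h), sigmoid(t)
        return h * t + s_prev * (1.0 - t)  # carry gate coupled as 1 - transform

    rng = np.random.default_rng(2)
    B, D, H = 8, 16, 32
    Wx, Ws, b = rng.normal(size=(D, 2 * H)), rng.normal(size=(H, 2 * H)), np.zeros(2 * H)
    gamma, beta = np.ones(2 * H), np.zeros(2 * H)
    s = bn_rhn_step(rng.normal(size=(B, D)), np.zeros((B, H)), Wx, Ws, b, gamma, beta)
    print(s.shape)  # (8, 32)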

Improving Gated Recurrent Unit Based Acoustic Modeling with Batch Normalization and Enlarged Context [article]

Jie Li, Yahui Shan, Xiaorui Wang, Yan Li
2018 arXiv   pre-print
Recently, we proposed an RNN model called minimal gated recurrent unit with input projection (mGRUIP), in which a context module, namely temporal convolution, is specifically designed to model the future  ...  In the literature, batch normalization has been applied to RNN in different ways.  ...  This prompts us to use batch normalization to improve the convergence of the optimization process.  ...
arXiv:1811.10169v1 fatcat:eodop4tmkzhu7mzyvrkiejinru

Implementation of a batch normalized deep LSTM recurrent network on a smartphone for human activity recognition

Tahmina Zebin, Ertan Balaban, Krikor B. Ozanyan, Alexander J. Casson, Niels Peek
2019 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)  
In this paper we present a Long-Short Term Memory (LSTM) deep recurrent neural network (RNN) model for the classification of human daily life activities by using the accelerometer and gyroscope data of  ...  To make the network fast and robust, we have employed dropout regularization and the recently introduced batch normalization method [13].  ...  We have utilized the sequential model with dense, conv1D, maxpooling1D, dropout, and batch normalization layers.  ...
doi:10.1109/bhi.2019.8834480 dblp:conf/bhi/ZebinBOCP19 fatcat:yeaiu3pdg5guzier2tn53odhvu
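
The abstract names a Keras sequential model with dense, conv1D, maxpooling1D, dropout, and batch normalization layers. A plausible sketch of such a stack is shown below; the window length (128 samples), the six inertial channels, the layer sizes, and the six output activity classes are assumptions, not the configuration reported in the paper.

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(128, 6)),              # accelerometer + gyroscope window (assumed)
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.BatchNormalization(),
        layers.LSTM(128, return_sequences=True),
        layers.Dropout(0.5),
        layers.LSTM(64),
        layers.BatchNormalization(),
        layers.Dense(6, activation="softmax"),    # six activity classes (assumed)
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    model.summary()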

Human activity recognition from inertial sensor time-series using batch normalized deep LSTM recurrent networks

Tahmina Zebin, Matthew Sperrin, Niels Peek, Alexander J. Casson
2018 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)  
Further, we show that this accuracy can be achieved with almost four times fewer training epochs by using a batch normalization approach.  ...  In this paper we present a Long-Short Term Memory (LSTM) deep recurrent neural network for the classification of six daily life activities from accelerometer and gyroscope data.  ...  Batch normalization has recently been introduced to overcome this by normalizing the x_t and h_{t-1} activations going into each layer by applying a covariate shift.  ...
doi:10.1109/embc.2018.8513115 pmid:30440301 fatcat:vlu6dej73jhznh27xbkocbxfey

Streaming Normalization: Towards Simpler and More Biologically-plausible Normalizations for Online and Recurrent Learning [article]

Qianli Liao, Kenji Kawaguchi, Tomaso Poggio
2016 arXiv   pre-print
We systematically explored a spectrum of normalization algorithms related to Batch Normalization (BN) and propose a generalized formulation that simultaneously solves two major limitations of BN: (1) online  ...  Unlike previous approaches, our technique can be applied out of the box to all learning scenarios (e.g., online learning, batch learning, fully-connected, convolutional, feedforward, recurrent and mixed  ...  Generalization to Recurrent Learning In this section, we generalize Sample Normalization, General Batch Normalization and Streaming Normalization to recurrent learning.  ... 
arXiv:1610.06160v1 fatcat:eg6vri5hfvgknb6buyqamoyvta
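
A highly simplified sketch of the "streaming" idea: normalize with running statistics that are updated one mini-batch (or one sample) at a time, so no full-batch or full-dataset pass is required. The momentum value and the update rule below are assumptions for illustration and not the paper's exact algorithm.

    import numpy as np

    class StreamingNorm:
        """Normalize with running mean/variance updated online, batch by batch."""
        def __init__(self, dim, momentum=0.9, eps=1e-5):
            self.mean = np.zeros(dim)
            self.var = np.ones(dim)
            self.momentum = momentum
            self.eps = eps

        def __call__(self, x):
            # Update the running estimates from the current samples; this works
            # even with a single sample, i.e. fully online learning.
            self.mean = self.momentum * self.mean + (1 - self.momentum) * x.mean(axis=0)
            self.var = self.momentum * self.var + (1 - self.momentum) * x.var(axis=0)
            return (x - self.mean) / np.sqrt(self.var + self.eps)

    norm = StreamingNorm(dim=32)
    rng = np.random.default_rng(3)
    for _ in range(200):                      # stream of small mini-batches
        y = norm(rng.normal(loc=5.0, scale=2.0, size=(4, 32)))
    print(y.mean().round(2), y.std().round(2))  # roughly zero mean, unit variance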

Investigation on the Combination of Batch Normalization and Dropout in BLSTM-based Acoustic Modeling for ASR

Li Wenjie, Gaofeng Cheng, Fengpei Ge, Pengyuan Zhang, Yonghong Yan
2018 Interspeech 2018  
In this paper, we explored some novel approaches to add batch normalization to the LSTM model in bidirectional mode.  ...  However, applying batch normalization in the LSTM model is more complicated and challenging than in the feed-forward network.  ...  People hold different views on how to use batch normalization on the recurrent neural network [9, 10].  ...
doi:10.21437/interspeech.2018-1597 dblp:conf/interspeech/LiCGZ018 fatcat:es7g237ho5cujihrcbqegao3fa

Layer Normalization [article]

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
2016 arXiv   pre-print
However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks.  ...  Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks.  ...  [Cooijmans et al., 2016] suggests the best performance of recurrent batch normalization is obtained by keeping independent normalization statistics for each time-step.  ...
arXiv:1607.06450v1 fatcat:w2kufqz6mfhdjb4okiyrpxyexe
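
A minimal sketch of layer normalization itself: statistics are computed over the features of each individual sample rather than over the mini-batch, so the result is independent of batch size and can be applied identically at every time-step of a recurrent network. Shapes are illustrative.

    import numpy as np

    def layer_norm(x, gamma, beta, eps=1e-5):
        # Mean and variance over the feature axis of each sample (axis -1),
        # not over the batch -- hence no dependence on the mini-batch size.
        mean = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        return gamma * (x - mean) / np.sqrt(var + eps) + beta

    rng = np.random.default_rng(4)
    h = rng.normal(size=(1, 32))               # works even with a batch of one
    out = layer_norm(h, np.ones(32), np.zeros(32))
    print(out.mean().round(2), out.std().round(2))  # ~0.0 ~1.0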

The microbiome in lung cancer tissue and recurrence-free survival

Brandilyn A Peters, Richard B. Hayes, Chandra Goparaju, Christopher Reid, Harvey I Pass, Jiyoung Ahn
2019 Cancer Epidemiology, Biomarkers and Prevention  
DNA extraction and amplification steps occurred in two batches (batch 1: 10 samples and batch 2: 28 samples; tumor-normal pairs from same patient kept together), but all samples were sequenced in the same batch.  ...
doi:10.1158/1055-9965.epi-18-0966 pmid:30733306 pmcid:PMC6449216 fatcat:lruu2gzz3vdefhrx6e7adwg3pi

Learning Representations that Support Extrapolation [article]

Taylor W. Webb, Zachary Dulberg, Steven M. Frankland, Alexander A. Petrov, Randall C. O'Reilly, Jonathan D. Cohen
2020 arXiv   pre-print
We also introduce a simple technique, temporal context normalization, that encourages representations that emphasize the relations between objects.  ...  We performed TCN before passing the embeddings to the recurrent network, and then de-normalized the predictions made by the recurrent network.  ...
arXiv:2007.05059v2 fatcat:h6mxldfbxbhbxmppddob6oox7y
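
One reading of temporal context normalization as described in the snippets: each feature is normalized over the temporal dimension of its own sequence before the recurrent network, and the predictions are de-normalized with the stored statistics afterwards. This is an interpretation for illustration, not the authors' exact formulation.

    import numpy as np

    def temporal_context_norm(z, eps=1e-8):
        # z: (batch, time, features); normalize each feature over the time axis
        # of each individual sequence.
        mean = z.mean(axis=1, keepdims=True)
        std = z.std(axis=1, keepdims=True) + eps
        return (z - mean) / std, mean, std

    def temporal_context_denorm(y, mean, std):
        # Undo the normalization on downstream predictions.
        return y * std + mean

    rng = np.random.default_rng(5)
    z = rng.normal(size=(2, 10, 8))
    z_norm, mu, sd = temporal_context_norm(z)
    print(np.allclose(temporal_context_denorm(z_norm, mu, sd), z))  # True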

AdaFilter: Adaptive Filter Fine-Tuning for Deep Transfer Learning

Yunhui Guo, Yandong Li, Liqiang Wang, Tajana Rosing
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
We use a recurrent gated network to selectively fine-tune convolutional filters based on the activations of the previous layer.  ...  We compare gated batch normalization (Gated BN) against the standard batch normalization (Standard BN).  ...  In standard batch normalization, we use one batch normalization layer to normalize each channel across a mini-batch. Table 3 shows the results of the Gated BN and the standard BN on all the datasets.  ...
doi:10.1609/aaai.v34i04.5824 fatcat:pcv2onzk4rdljlrs3k62b6bxqm
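
A schematic sketch of the filter-gating idea described above: a gate derived from the previous layer's activations decides, per output channel, whether to use the frozen pre-trained filter response or the fine-tuned one. The pooling, the linear gate, and all shapes are assumptions; the actual method uses a recurrent gated network shared across layers, and the gated BN details are omitted here.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def adafilter_mix(prev_act, frozen_out, finetuned_out, Wg, bg):
        # Gate from pooled previous-layer activations, one value per output filter.
        pooled = prev_act.mean(axis=(2, 3))            # (batch, in_channels)
        gate = sigmoid(pooled @ Wg + bg)[:, :, None, None]
        # Per-channel choice between fine-tuned and frozen pre-trained filters.
        return gate * finetuned_out + (1.0 - gate) * frozen_out

    rng = np.random.default_rng(6)
    prev_act = rng.normal(size=(2, 16, 8, 8))          # previous layer activations
    frozen_out = rng.normal(size=(2, 32, 8, 8))        # pre-trained conv output
    finetuned_out = rng.normal(size=(2, 32, 8, 8))     # fine-tuned conv output
    Wg, bg = rng.normal(size=(16, 32)), np.zeros(32)
    print(adafilter_mix(prev_act, frozen_out, finetuned_out, Wg, bg).shape)  # (2, 32, 8, 8)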

AdaFilter: Adaptive Filter Fine-tuning for Deep Transfer Learning [article]

Yunhui Guo, Yandong Li, Liqiang Wang, Tajana Rosing
2019 arXiv   pre-print
We use a recurrent gated network to selectively fine-tune convolutional filters based on the activations of the previous layer.  ...  We compare gated batch normalization (Gated BN) against the standard batch normalization (Standard BN).  ...  In standard batch normalization, we use one batch normalization layer to normalize each channel across a mini-batch. Table 3 shows the results of the Gated BN and the standard BN on all the datasets.  ...
arXiv:1911.09659v2 fatcat:54e6rypzpzd7jckz5w7baqonfe

Applying the Transformer to Character-level Transduction [article]

Shijie Wu, Ryan Cotterell, Mans Hulden
2021 arXiv   pre-print
We show that with a large enough batch size, the transformer does indeed outperform recurrent models.  ...  Yet for character-level transduction tasks, e.g. morphological inflection generation and historical text normalization, there are few works that outperform recurrent models using the transformer.  ...  Historical Text Normalization.  ...
arXiv:2005.10213v2 fatcat:xghukun42nho5owxwrmrhot6cm

Recurrence-free unconstrained handwritten text recognition using gated fully convolutional network [article]

Denis Coquenet, Clément Chatelain, Thierry Paquet
2020 arXiv   pre-print
This is generally processed by deep recurrent neural networks and more specifically with the use of Long Short-Term Memory cells.  ...  In this paper we present a Gated Fully Convolutional Network architecture that is a recurrence-free alternative to the well-known CNN+LSTM architectures.  ...  Batch Normalization is the one with the worst CER. This can be explained by the small mini-batches used, which lead to a slightly less stable Batch Normalization.  ...
arXiv:2012.04961v1 fatcat:z7mx2kkavjhi5j23juxk4uqrnu
Showing results 1 — 15 out of 69,945 results