An Adaptive Stochastic Nesterov Accelerated Quasi Newton Method for Training RNNs [article]

S. Indrapriyadarsini, Shahrzad Mahboubi, Hiroshi Ninomiya, Hideki Asai
2019-09-09, arXiv, pre-print
A common problem in training neural networks is the vanishing and/or exploding gradient problem, which is seen more prominently in the training of Recurrent Neural Networks (RNNs). Several algorithms have therefore been proposed for training RNNs. This paper proposes a novel adaptive stochastic Nesterov accelerated quasi-Newton (aSNAQ) method for training RNNs. aSNAQ is an accelerated method that uses Nesterov's gradient term along with second-order curvature information. The performance of the proposed method is evaluated in TensorFlow on benchmark sequence modeling problems. The results show improved performance while maintaining a low per-iteration cost, so the method can be used effectively to train RNNs.
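The abstract's central idea, combining Nesterov's gradient term with second-order curvature information, can be illustrated with a minimal sketch. This is not the paper's aSNAQ algorithm: it is a deterministic, fixed-momentum toy that assumes the curvature is approximated by a standard L-BFGS two-loop recursion and that no line search is used. The names `two_loop` and `naq_minimize` and the parameters `mu`, `lr`, and `memory` are hypothetical and chosen only for illustration.

```python
import numpy as np
from collections import deque


def two_loop(grad, pairs):
    """Apply an L-BFGS inverse-Hessian approximation to `grad`.

    `pairs` is a list of (s, y) curvature pairs, oldest first.
    With no pairs, this returns a copy of the gradient.
    """
    q = grad.copy()
    alphas = []
    for s, y in reversed(pairs):          # newest pair first
        a = s.dot(q) / y.dot(s)
        alphas.append(a)
        q -= a * y
    if pairs:                             # scale by the usual initial guess
        s, y = pairs[-1]
        q *= s.dot(y) / y.dot(y)
    for (s, y), a in zip(pairs, reversed(alphas)):   # oldest pair first
        b = y.dot(q) / y.dot(s)
        q += (a - b) * s
    return q


def naq_minimize(grad_fn, w0, mu=0.8, lr=1.0, memory=5, iters=50):
    """Toy Nesterov-accelerated quasi-Newton loop (illustrative assumption)."""
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)
    pairs = deque(maxlen=memory)
    for _ in range(iters):
        look = w + mu * v                 # Nesterov look-ahead point
        g_look = grad_fn(look)            # gradient taken at the look-ahead
        v = mu * v - lr * two_loop(g_look, list(pairs))
        w = w + v
        s = w - look                      # curvature pair measured from look-ahead
        y = grad_fn(w) - g_look
        if y.dot(s) > 1e-10:              # keep only positive-curvature pairs
            pairs.append((s, y))
    return w


# One-dimensional quadratic f(w) = 1.5 * w**2 - 6 * w, minimized at w = 2.
w_star = naq_minimize(lambda w: 3.0 * w - 6.0, np.array([10.0]))
```

On this one-dimensional quadratic the curvature estimate becomes exact after the first step, so the iterate settles at the minimizer. Because the sketch has no step-size control, `lr` may need to be reduced on ill-conditioned problems.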
arXiv:1909.03620v1 fatcat:sosfmq27q5hmxidgtomsqmmjke