DizzyRNN: Reparameterizing Recurrent Neural Networks for Norm-Preserving Backpropagation

Victor Dorobantu, Per Andre Stromhaug, Jess Renteria
2016 arXiv pre-print
The vanishing and exploding gradient problems are well-studied obstacles that make it difficult for recurrent neural networks to learn long-term time dependencies. We propose a reparameterization of standard recurrent neural networks to update linear transformations in a provably norm-preserving way through Givens rotations. Additionally, we use the absolute value function as an element-wise non-linearity to preserve the norm of backpropagated signals over the entire network. We show that this reparameterization reduces the number of parameters and maintains the same algorithmic complexity as a standard recurrent neural network, while outperforming standard recurrent neural networks with orthogonal initializations and Long Short-Term Memory networks on the copy problem.
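The mechanism described in the abstract can be illustrated with a minimal NumPy sketch, which is not the authors' implementation: a recurrent weight matrix composed of Givens rotations is orthogonal and therefore preserves the 2-norm of the hidden state, and the element-wise absolute value non-linearity preserves that norm as well. The names below (givens_rotation, W, h) are illustrative assumptions.

import numpy as np

def givens_rotation(n, i, j, theta):
    # n x n identity with a plane rotation by theta in coordinates (i, j);
    # every such matrix is orthogonal.
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

rng = np.random.default_rng(0)
n = 4

# Compose a few Givens rotations into a single orthogonal recurrent matrix W.
W = np.eye(n)
for k, theta in enumerate(rng.uniform(-np.pi, np.pi, size=n - 1)):
    W = givens_rotation(n, k, k + 1, theta) @ W

# One recurrent step on the hidden state: orthogonal transform, then |.|.
h = rng.standard_normal(n)
h_next = np.abs(W @ h)

# Both operations preserve the Euclidean norm of the hidden state.
assert np.isclose(np.linalg.norm(h_next), np.linalg.norm(h))

Since the transpose of an orthogonal matrix is also orthogonal and the derivative of the absolute value is ±1 element-wise, the same norm-preservation argument applies to the backpropagated gradient, which is the property the paper targets.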
arXiv:1612.04035v1