Recurrent Normalization Propagation
César Laurent, Nicolas Ballas, Pascal Vincent
Nov 04, 2016 (modified: Jan 16, 2017) | ICLR 2017 conference submission | readers: everyone
Abstract: We propose an LSTM parametrization that preserves the means and variances of the hidden states and memory cells across time. While having training benefits similar to Recurrent Batch Normalization and Layer Normalization, it does not need to estimate statistics at each time step and therefore requires fewer computations overall. We also investigate the impact of the parametrization on the gradient flow and present a way of initializing the weights accordingly.
We evaluate our proposal on language modelling and image generative modelling tasks. We show empirically that it performs similarly to or better than other recurrent normalization approaches, while being faster to execute.
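The claim that no statistics need to be estimated at each time step can be illustrated with the general idea behind Normalization Propagation. The sketch below is not the authors' LSTM parametrization. It only shows that, assuming zero-mean, unit-variance, uncorrelated inputs, a linear map whose weight rows have unit L2 norm produces pre-activations with approximately zero mean and unit variance, so the scale is fixed by the weights rather than measured from the activations. The helper names `normalize_rows` and `preactivations`, the layer sizes, and the Gaussian inputs are all illustrative assumptions.

```python
import math
import random
import statistics


def normalize_rows(weights):
    """Rescale every weight row to unit L2 norm (hypothetical helper)."""
    normalized = []
    for row in weights:
        norm = math.sqrt(sum(w * w for w in row))
        normalized.append([w / norm for w in row])
    return normalized


def preactivations(weights, x):
    """Plain matrix-vector product W x, one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


rng = random.Random(0)
n_in, n_out, n_samples = 64, 4, 5000

# Arbitrarily scaled raw weights; the row normalization removes the scale.
raw = [[rng.gauss(0.0, 3.0) for _ in range(n_in)] for _ in range(n_out)]
W = normalize_rows(raw)

# Assumption: inputs are i.i.d. standard normal (zero mean, unit variance).
samples = [
    preactivations(W, [rng.gauss(0.0, 1.0) for _ in range(n_in)])
    for _ in range(n_samples)
]

# Statistics are computed here only to check the property, not to normalize.
means = [statistics.fmean(s[j] for s in samples) for j in range(n_out)]
variances = [statistics.pvariance([s[j] for s in samples]) for j in range(n_out)]

print("means:    ", [round(m, 3) for m in means])
print("variances:", [round(v, 3) for v in variances])
```

The printed means should be close to 0 and the variances close to 1 for every output unit, even though no batch or layer statistics were used in the forward computation. A recurrent model adds nonlinearities and gating on top of this, which the sketch does not attempt to cover.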
TL;DR: An extension of Normalization Propagation to the LSTM.
Keywords: Deep learning, Optimization