Abstract: We propose a modification to weight normalization techniques that provides the same convergence benefits while requiring fewer computational operations. The proposed method, FastNorm, exploits the low-rank structure of weight updates to infer the norms without explicitly calculating them, replacing an $O(n^2)$ computation with an $O(n)$ one for a fully-connected layer. It improves numerical stability and reduces accuracy variance, enabling higher learning rates and better convergence. We report experimental results that illustrate the advantage of the proposed method.
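A minimal illustrative sketch of the kind of low-rank shortcut the abstract describes (not the authors' implementation): for a fully-connected layer updated with a single-sample (rank-1) gradient, the squared row norms of the weight matrix can be refreshed in $O(n)$ using the pre-activations already computed in the forward pass, instead of recomputed from scratch in $O(n^2)$. The per-sample SGD setting and all variable names here are assumptions for illustration.

```python
# Sketch, not the authors' method: incremental squared-row-norm tracking
# under a rank-1 SGD update W' = W - lr * outer(delta, x).
import numpy as np

rng = np.random.default_rng(0)
n = 512                                    # layer width (square for simplicity)
lr = 0.1

W = rng.standard_normal((n, n)) / np.sqrt(n)
sq_row_norms = np.sum(W ** 2, axis=1)      # explicit O(n^2) pass, done once

x = rng.standard_normal(n)                 # layer input (one sample)
y = W @ x                                  # forward-pass pre-activations
delta = rng.standard_normal(n)             # stand-in gradient w.r.t. y

# Expanding ||W_i - lr*delta_i*x||^2 = ||W_i||^2 - 2*lr*delta_i*(W_i.x)
#   + lr^2*delta_i^2*||x||^2, where W_i.x is just y_i from the forward pass,
# so the whole refresh is O(n) rather than O(n^2).
sq_row_norms += -2.0 * lr * delta * y + (lr ** 2) * (delta ** 2) * (x @ x)

W -= lr * np.outer(delta, x)               # the actual weight update

# Sanity check against the explicit O(n^2) recomputation.
assert np.allclose(sq_row_norms, np.sum(W ** 2, axis=1))
```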
Keywords: Neural networks, Training, Convergence