FastNorm: Improving Numerical Stability of Deep Network Training with Efficient Normalization


Nov 07, 2017 (modified: Nov 07, 2017) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: We propose a modification to weight normalization techniques that provides the same convergence benefits but requires fewer computational operations. The proposed method, FastNorm, exploits the low-rank properties of weight updates and infers the norms without explicitly calculating them, replacing an $O(n^2)$ computation with an $O(n)$ one for a fully-connected layer. It improves numerical stability and reduces the variance in accuracy, which enables higher learning rates and offers better convergence. We report experimental results that illustrate the advantages of the proposed method.
  • Keywords: Neural networks, Training, Convergence
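
To make the complexity claim in the abstract concrete, here is a minimal sketch of how row norms could be inferred from a low-rank update rather than recomputed. It is an illustration under stated assumptions, not necessarily the authors' algorithm: it assumes a single-sample SGD step on a fully-connected layer y = W x, so the update is the rank-one matrix lr * outer(g, x), where g is the gradient with respect to y. The function name `updated_row_norms_sq` and all variable names are hypothetical. Expanding ||w_i - lr g_i x||^2 gives ||w_i||^2 - 2 lr g_i y_i + lr^2 g_i^2 ||x||^2, which reuses y from the forward pass and never touches W.

```python
import numpy as np


def updated_row_norms_sq(norms_sq, y, g, x, lr):
    """Squared row norms of W - lr * outer(g, x), computed without reading W.

    Hypothetical helper for illustration; assumes a rank-one update.
    norms_sq : squared row norms of the current W, shape (m,)
    y        : W @ x, already available from the forward pass, shape (m,)
    g        : gradient of the loss with respect to y, shape (m,)
    x        : layer input, shape (n,)
    lr       : learning rate (scalar)
    """
    x_sq = np.dot(x, x)  # O(n)
    # Elementwise over rows: O(m). Total cost is O(m + n) instead of O(m * n).
    return norms_sq - 2.0 * lr * g * y + (lr ** 2) * (g ** 2) * x_sq


# Check against the direct O(m * n) recomputation on random data.
rng = np.random.default_rng(0)
m, n, lr = 4, 6, 0.1
W = rng.standard_normal((m, n))
x = rng.standard_normal(n)
g = rng.standard_normal(m)  # stand-in for a backpropagated gradient

y = W @ x
norms_sq = np.sum(W ** 2, axis=1)

fast = updated_row_norms_sq(norms_sq, y, g, x, lr)
direct = np.sum((W - lr * np.outer(g, x)) ** 2, axis=1)
assert np.allclose(fast, direct)
```

With mini-batches the update has rank up to the batch size rather than one, so a sketch like this would need extra cross terms; how the paper handles that case is not stated in the abstract.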