CDF Normalization for Controlling the Distribution of Hidden Layer Activations

Published: 18 Oct 2021, Last Modified: 05 May 2023
Venue: ICBINB@NeurIPS2021 Poster
Keywords: Batch Normalization, Regularization, Gaussian
TL;DR: We go beyond Batch Normalization by estimating the actual CDFs of layer activations, and show that explicitly enforcing a Gaussian distribution is not effective.
Abstract: Batch Normalization (BN) is a normalization method for deep neural networks that has been shown to accelerate training. While the effectiveness of BN is undisputed, the explanation of its effectiveness is still being studied. The original BN paper attributes the success of BN to reducing internal covariate shift, so we take this a step further and explicitly enforce a Gaussian distribution on hidden layer activations. This approach proves to be ineffective, demonstrating further that reducing internal covariate shift is not essential to the success of normalization layers.
Category: Negative result: I would like to share my insights and negative results on this topic with the community
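
The page does not include the method's code; as a rough illustration of the idea named in the title, here is a minimal PyTorch sketch of one plausible form of CDF normalization: pass each feature's activations through their empirical CDF and then through the inverse Gaussian CDF, so the batch-wise distribution becomes approximately standard normal. The function name `cdf_normalize` and all implementation details are assumptions for illustration, not the authors' actual method.

```python
# Hypothetical sketch of CDF normalization (not the paper's code):
# map each feature's activations through its empirical CDF, then through
# the inverse Gaussian CDF, yielding approximately N(0, 1) activations.
import torch

def cdf_normalize(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """x: (batch, features). Returns Gaussianized activations per feature."""
    n = x.shape[0]
    # Rank each activation within its feature column (empirical CDF estimate).
    ranks = x.argsort(dim=0).argsort(dim=0).float()
    u = (ranks + 0.5) / n                 # uniform values in (0, 1)
    u = u.clamp(eps, 1 - eps)             # avoid +/- inf from the probit
    # Inverse Gaussian CDF (probit) via the inverse error function.
    return torch.erfinv(2.0 * u - 1.0) * (2.0 ** 0.5)

# Example: a batch of 256 activations for 64 hidden units.
h = torch.randn(256, 64) ** 3             # deliberately non-Gaussian
g = cdf_normalize(h)
print(g.mean().item(), g.std().item())    # roughly 0 and 1
```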