Regularity Normalization: Constraining Implicit Space with Minimum Description Length

Baihan Lin

Regularity Normalization: Constraining Implicit Space with Minimum Description Length

Baihan Lin

11 Mar 2019 (modified: 22 Jun 2025)Submitted to LLD 2019Readers: Everyone

Keywords: MDL, Universal code, LLD, Normalization, Biological plausibility, Unsupervised attention, Imbalanced data

TL;DR: Considering neural network optimization process as a model selection problem, we introduce a biological plausible normalization method that extracts statistical regularity under MDL principle to tackle imbalanced and limited data issue.

Abstract: Inspired by the adaptation phenomenon of biological neuronal firing, we propose regularity normalization: a reparameterization of the activation in the neural network that take into account the statistical regularity in the implicit space. By considering the neural network optimization process as a model selection problem, the implicit space is constrained by the normalizing factor, the minimum description length of the optimal universal code. We introduce an incremental version of computing this universal code as normalized maximum likelihood and demonstrated its flexibility to include data prior such as top-down attention and other oracle information and its compatibility to be incorporated into batch normalization and layer normalization. The preliminary results showed that the proposed method outperforms existing normalization methods in tackling the limited and imbalanced data from a non-stationary distribution benchmarked on computer vision task. As an unsupervised attention mechanism given input data, this biologically plausible normalization has the potential to deal with other complicated real-world scenarios as well as reinforcement learning setting where the rewards are sparse and non-uniform. Further research is proposed to discover these scenarios and explore the behaviors among different variants.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 3 code implementations](https://www.catalyzex.com/paper/regularity-normalization-constraining/code)

3 Replies

Loading