Explicit Regularization in Overparametrized Models via Noise Injection

Antonio Orvieto, Anant Raj, Hans Kersting, Francis R. Bach

Published: 2023, Last Modified: 01 Aug 2025 | AISTATS 2023 | CC BY-SA 4.0

Abstract: Injecting noise within gradient descent has several desirable features, such as smoothing and regularizing properties. In this paper, we investigate the effects of injecting noise before computing a gradient step. We demonstrate that small perturbations can induce explicit regularization for simple models based on the L1-norm, group L1-norms, or nuclear norms. However, when applied to overparametrized neural networks with large widths, we show that the same perturbations can cause variance explosion. To overcome this, we propose using independent layer-wise perturbations, which provably allow for explicit regularization without variance explosion. Our empirical results show that these small perturbations lead to improved generalization performance compared to vanilla gradient descent.
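As a rough illustration of "injecting noise before computing a gradient step", the sketch below evaluates the gradient at a randomly perturbed point and then applies the update to the unperturbed iterate. Everything specific here is an assumption made for illustration rather than something stated in the abstract: the toy model (least squares with weights parametrized as the elementwise product `u * v`), the Gaussian noise, the function names `grad_loss` and `perturbed_gd`, and all hyperparameter values. It does not implement the layer-wise scheme proposed in the paper.

```python
import numpy as np


def grad_loss(u, v, X, y):
    """Gradient of 0.5/n * ||X (u*v) - y||^2 with respect to u and v."""
    n = X.shape[0]
    residual = X @ (u * v) - y
    g_w = X.T @ residual / n      # gradient w.r.t. w = u * v
    return g_w * v, g_w * u       # chain rule through the elementwise product


def perturbed_gd(X, y, sigma=0.1, lr=0.05, steps=2000, seed=0):
    """Gradient descent where each gradient is taken at a noisy point.

    Hypothetical toy setup: Gaussian noise of scale sigma is drawn
    independently for u and v at every step.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    u = np.ones(d)
    v = np.ones(d)
    for _ in range(steps):
        # Noise is injected BEFORE the gradient is evaluated ...
        eps_u = sigma * rng.standard_normal(d)
        eps_v = sigma * rng.standard_normal(d)
        g_u, g_v = grad_loss(u + eps_u, v + eps_v, X, y)
        # ... but the update is applied to the unperturbed iterate.
        u = u - lr * g_u
        v = v - lr * g_v
    return u, v


if __name__ == "__main__":
    X = np.eye(3)
    y = np.array([2.0, 0.5, 1.0])
    u, v = perturbed_gd(X, y, sigma=0.1)
    print("recovered weights u*v:", u * v)
```

With `sigma=0.0` the loop reduces to plain gradient descent. For this toy model, averaging the perturbed loss over the noise adds a term proportional to `sigma**2 * (u_j**2 + v_j**2)` per coordinate (weighted by the squared norm of the corresponding column of `X`); at a fixed product `w_j = u_j * v_j` that term is smallest when it equals `2 * abs(w_j)`, which gives an L1-type penalty of the kind the abstract refers to.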