Variational Deep Learning via Implicit Regularization

ICLR 2026 Conference Submission 20903 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Implicit Regularization, Bayesian Deep Learning, Generalized Variational Inference, Implicit Bias of SGD
TL;DR: We demonstrate theoretically and empirically that one can exploit the implicit bias of SGD for variational inference in Bayesian neural networks.
Abstract: Modern deep learning models generalize remarkably well in-distribution, despite being overparametrized and trained with little to no explicit regularization. Current theory instead credits implicit regularization imposed by the choice of architecture, hyperparameters, and optimization procedure. However, deep neural networks can be surprisingly non-robust, resulting in overconfident predictions and poor out-of-distribution generalization. Bayesian deep learning addresses this via model averaging, but typically requires significant computational resources as well as carefully elicited priors to avoid overriding the benefits of implicit regularization. In this work, we propose instead to regularize variational neural networks solely via the implicit bias of (stochastic) gradient descent. We theoretically characterize this inductive bias in overparametrized linear models as generalized variational inference and demonstrate the importance of the choice of parametrization. Empirically, our approach achieves strong in- and out-of-distribution performance without additional hyperparameter tuning and with minimal computational overhead.
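To make the abstract's central idea concrete, below is a minimal sketch (assuming PyTorch) of training a mean-field Gaussian variational linear model by plain SGD on the expected data-fit loss alone, with no explicit KL or prior term, so that any regularization of the variational parameters comes only from the optimizer's implicit bias. The class name VariationalLinear, the hyperparameters, and the toy data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): variational parameters trained by
# plain SGD on the expected negative data-fit term only -- no explicit
# KL/regularizer -- relying solely on the implicit bias of the optimizer.
import torch


class VariationalLinear(torch.nn.Module):
    """Linear layer with a factorized Gaussian distribution over its weights."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.mean = torch.nn.Parameter(torch.zeros(out_features, in_features))
        # Standard deviation parametrized via softplus(rho) to stay positive.
        self.rho = torch.nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        std = torch.nn.functional.softplus(self.rho)
        # Reparametrization trick: sample weights, keep gradients w.r.t. (mean, rho).
        weight = self.mean + std * torch.randn_like(std)
        return x @ weight.t()


def train(model, x, y, epochs: int = 200, lr: float = 0.1):
    # Plain SGD on the Monte Carlo estimate of the expected loss; no explicit
    # regularization of the variational parameters is added.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return model


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(64, 10)  # toy regression data (illustrative only)
    y = x @ torch.randn(10, 1) + 0.1 * torch.randn(64, 1)
    model = train(VariationalLinear(10, 1), x, y)
    # Predictive mean/std via Monte Carlo samples from the learned distribution.
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(100)])
    print(samples.mean(0)[:3].squeeze(), samples.std(0)[:3].squeeze())
```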
Supplementary Material: zip
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 20903