ELBO, regularized maximum likelihood, and their common one-sample approximation for training stochastic neural networks

Published: 07 May 2025, Last Modified: 17 Jun 2025. UAI 2025 Poster. License: CC BY 4.0
Keywords: Stochastic/Bayesian Neural Networks, Evidence Lower Bound (ELBO), Jensen Gap, Prediction Variance
Abstract: Monte Carlo approximations are central to the training of stochastic neural networks in general, and Bayesian neural networks (BNNs) in particular. We observe that the common one-sample approximation of the standard training objective can be viewed both as maximizing the Evidence Lower Bound (ELBO) and as maximizing a regularized log-likelihood of a compound distribution. The latter objective differs from the ELBO only in the order of the logarithm and the expectation, and is theoretically grounded in PAC-Bayes theory. We argue theoretically and demonstrate empirically that training with the regularized maximum likelihood objective increases prediction variance, enhancing performance in misspecified settings, improving adversarial robustness, and strengthening out-of-distribution (OOD) detection. Our findings help reconcile previous contradictions in the literature by providing a detailed analysis of how training objectives and Monte Carlo sample sizes affect uncertainty quantification in stochastic neural networks.
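To make the "order of the logarithm and the expectation" point concrete, the following is a minimal sketch of the two objectives described in the abstract. The notation (variational posterior q, prior \pi, data D) and the unit weight on the KL term are assumptions for illustration, not taken from the paper itself.

% ELBO: expectation of the log-likelihood under q, penalized by a KL term.
\mathcal{L}_{\mathrm{ELBO}}(q) = \mathbb{E}_{\theta \sim q}\left[\log p(D \mid \theta)\right] - \mathrm{KL}(q \,\|\, \pi)

% Regularized log-likelihood of the compound (marginal) distribution:
% the logarithm and the expectation swap order.
\mathcal{L}_{\mathrm{RML}}(q) = \log \mathbb{E}_{\theta \sim q}\left[p(D \mid \theta)\right] - \mathrm{KL}(q \,\|\, \pi)

% One-sample Monte Carlo approximation: with a single draw \hat{\theta} \sim q,
% both objectives reduce to the same estimate,
\log p(D \mid \hat{\theta}) - \mathrm{KL}(q \,\|\, \pi), \qquad \hat{\theta} \sim q,
% which is why the two views coincide under the common one-sample approximation.

By Jensen's inequality, \mathcal{L}_{\mathrm{ELBO}}(q) \le \mathcal{L}_{\mathrm{RML}}(q); the difference is the Jensen gap listed in the keywords, and the two objectives only diverge once more than one Monte Carlo sample is used.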
Submission Number: 736