Stable Optimization of Gaussian Likelihoods

Published: 01 Feb 2023, Last Modified: 13 Feb 2023
Submitted to ICLR 2023
Keywords: Gaussian, Likelihood, Optimization, Uncertainty, Stabilization
TL;DR: We analyze the instability of Gaussian likelihood optimization and propose a gradient-based optimizer that yields less volatile optimization, especially for contextual, multivariate target distributions with full covariances.
Abstract: Uncertainty-aware modeling has emerged as a key component in modern machine learning frameworks. The de facto standard approach adopts heteroscedastic Gaussian distributions and minimizes the negative log-likelihood (NLL) under observed data. However, optimizing this objective turns out to be surprisingly intricate, and the current state of the art reports several instabilities. This work breaks down the optimization problem, initially focusing on non-contextual settings where convergence can be analyzed analytically. We show that (1) in this learning scheme, the eigenvalues of the predictive covariance determine the stability of learning, and (2) the coupling of gradients and predictions builds up errors in both mean and covariance if either is poorly approximated. Building on these insights, we propose Trustable, a novel optimizer that overcomes these instabilities methodically by combining systematic update restrictions in the form of trust regions with structured, tractable natural gradients. We demonstrate in several challenging experiments that Trustable outperforms current optimizers in regression with neural networks in terms of NLL, MSE, and further performance metrics. Unlike other optimizers, Trustable yields an improved and more stable fit and can also be applied to multivariate outputs with full covariance matrices.
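The objective the abstract analyzes is compact enough to state in code. Below is a minimal, hypothetical PyTorch sketch of the non-contextual heteroscedastic Gaussian NLL setup, with a plain SGD step standing in for the optimizer under study; this is not the Trustable method itself, and the data, dimensions, and hyperparameters are illustrative placeholders.

# Minimal sketch (assumption: not the authors' Trustable optimizer) of
# minimizing the Gaussian NLL with a learned mean and full covariance.
import torch

torch.manual_seed(0)

d = 2                          # output dimension (illustrative)
y = torch.randn(256, d)        # synthetic observations

# Non-contextual setting: a single mean and full covariance are learned directly.
mu = torch.zeros(d, requires_grad=True)
L_raw = torch.zeros(d, d, requires_grad=True)  # unconstrained Cholesky factor

opt = torch.optim.SGD([mu, L_raw], lr=1e-2)

for step in range(1000):
    # Sigma = L L^T with a strictly positive diagonal, so it stays positive definite.
    L = torch.tril(L_raw, diagonal=-1) + torch.diag(torch.exp(torch.diagonal(L_raw)))
    dist = torch.distributions.MultivariateNormal(mu, scale_tril=L)
    nll = -dist.log_prob(y).mean()
    opt.zero_grad()
    nll.backward()
    # The gradient of the mean is scaled by Sigma^{-1}, so shrinking covariance
    # eigenvalues inflate the mean update -- the coupling instability that points
    # (1) and (2) of the abstract describe.
    opt.step()

Even in this non-contextual toy setting, plain gradient descent couples the mean and covariance updates through Sigma^{-1}, which is the failure mode that motivates the trust-region and natural-gradient machinery in the paper.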
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Probabilistic Methods (e.g., variational inference, causal inference, Gaussian processes)