Keywords: Deep Gaussian Processes, Optimization, Mode Collapse, Whitening, Initialization
TL;DR: We analyze the possible causes of mode collapse in DGPs (where, during training, the variational posterior collapses onto the prior distribution) and propose a solution that avoids it.
Abstract: Deep Gaussian Processes (DGPs) define a hierarchical model capable of learning complex, non-stationary processes. Exact inference is intractable in DGPs, so a variational distribution is used in each layer. One of the main challenges when training DGPs is preventing a phenomenon known as mode collapse, where, during training, the variational distribution becomes the prior distribution, which is a minimizer of the KL divergence term in the ELBO. Two main factors influence the optimization process: the mean function of the inner GPs and the use of the whitened representation of the variational distribution. In this work, we propose a data-driven initialization of the variational parameters that a) already at initialization, predicts a good approximation of the objective function, b) avoids mode collapse, and c) is supported by a theoretical analysis of the behavior of the KL divergence and by experimental results on real-world datasets.
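The sketch below is a minimal, hypothetical illustration (not the paper's actual procedure) of the mechanism described in the abstract: for the Gaussian variational distribution over inducing outputs, setting q(u) equal to the prior p(u) drives the KL term of the ELBO to zero, whereas a data-driven initialization of the variational mean keeps q(u) informative. The inducing-point setup, RBF kernel, and nearest-neighbor target initialization are assumptions for illustration only.

```python
import numpy as np

def kl_gaussians(mu_q, S_q, mu_p, S_p):
    """KL( N(mu_q, S_q) || N(mu_p, S_p) ) for full-covariance Gaussians."""
    d = mu_q.shape[0]
    S_p_inv = np.linalg.inv(S_p)
    diff = mu_p - mu_q
    return 0.5 * (np.trace(S_p_inv @ S_q)
                  + diff @ S_p_inv @ diff
                  - d
                  + np.log(np.linalg.det(S_p) / np.linalg.det(S_q)))

rng = np.random.default_rng(0)
M = 5                                    # number of inducing points (assumption)
Z = rng.uniform(-3, 3, size=(M, 1))      # inducing inputs
K_zz = np.exp(-0.5 * (Z - Z.T) ** 2) + 1e-6 * np.eye(M)   # RBF prior covariance

# Mode collapse: q(u) = p(u) = N(0, K_zz) makes the KL term of the ELBO zero.
print(kl_gaussians(np.zeros(M), K_zz, np.zeros(M), K_zz))  # ~0.0

# A (purely illustrative) data-driven alternative: initialize the variational
# mean from the targets observed near each inducing input instead of zeros.
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)
q_mu = np.array([y[np.argmin(np.abs(X[:, 0] - z))] for z in Z[:, 0]])
print(kl_gaussians(q_mu, K_zz, np.zeros(M), K_zz))          # > 0: informative init
```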
Submission Number: 53