Keywords: variational autoencoders, encoder, optima, stability, low-dimensional manifold
Abstract: Variational Autoencoders (VAEs) are one of the most commonly used generative models, particularly for image data. A prominent difficulty in training VAEs is data that is supported on a lower dimensional manifold. Recent work by Dai and Wipf (2020) proposes a two-stage training algorithm for VAEs, based on a conjecture that in standard VAE training the generator will converge to a solution with 0 variance which is correctly supported on the ground truth manifold. They gave partial support for this conjecture by showing that some optima of the VAE loss do satisfy this property, but did not analyze the training dynamics. In this paper, we show that for linear encoders/decoders, the conjecture is true—that is the VAE training does recover a generator with support equal to the ground truth manifold—and does so due to an implicit bias of gradient descent rather than merely the VAE loss itself. In the nonlinear case, we show that VAE training frequently learns a higher-dimensional manifold which is a superset of the ground truth manifold.
One-sentence Summary: We analyze the landscape and implicit optimization bias of VAEs in the presence of low-dimensional data and the implications thereof.