Diagnosing and Enhancing VAE Models

Published: 21 Dec 2018, Last Modified: 22 Oct 2023 | ICLR 2019 Conference Blind Submission | Readers: Everyone
Abstract: Although variational autoencoders (VAEs) represent a widely influential deep generative model, many aspects of the underlying energy function remain poorly understood. In particular, it is commonly believed that Gaussian encoder/decoder assumptions reduce the effectiveness of VAEs in generating realistic samples. In this regard, we rigorously analyze the VAE objective, differentiating situations where this belief is and is not actually true. We then leverage the corresponding insights to develop a simple VAE enhancement that requires no additional hyperparameters or sensitive tuning. Quantitatively, this proposal produces crisp samples and stable FID scores that are actually competitive with a variety of GAN models, all while retaining desirable attributes of the original VAE architecture. The code for our model is available at https://github.com/daib13/TwoStageVAE.
Keywords: variational autoencoder, generative models
TL;DR: We closely analyze the VAE objective function and draw novel conclusions that lead to simple enhancements.
Code: [daib13/TwoStageVAE](https://github.com/daib13/TwoStageVAE) + [3 community implementations](https://paperswithcode.com/paper/?openreview=B1e0X3C9tQ)
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [CelebA](https://paperswithcode.com/dataset/celeba), [Fashion-MNIST](https://paperswithcode.com/dataset/fashion-mnist), [MNIST](https://paperswithcode.com/dataset/mnist)
Community Implementations: [6 code implementations](https://www.catalyzex.com/paper/arxiv:1903.05789/code)
