VAEs meet Diffusion Models: Efficient and High-Fidelity Generation

Published: 08 Dec 2021, Last Modified: 05 May 2023 · DGMs and Applications @ NeurIPS 2021 (Oral)
Keywords: Deep Generative Models, VAE, diffusion models, explicit-likelihood models
TL;DR: A novel framework unifying VAEs and diffusion models for efficient, high-fidelity image synthesis.
Abstract: Diffusion probabilistic models have been shown to generate state-of-the-art results on several competitive image synthesis benchmarks, but they lack a low-dimensional, interpretable latent space and are slow at generation. Variational Autoencoders (VAEs), on the other hand, have access to a low-dimensional latent space but, despite recent advances, exhibit poor sample quality. We present VAEDM, a novel generative framework for refining VAE-generated samples using diffusion models, along with a novel conditional forward-process parameterization for diffusion models. We show that the resulting parameterization can improve upon the unconditional diffusion model in terms of sampling efficiency during inference while also equipping diffusion models with the low-dimensional, VAE-inferred latent code. Furthermore, we show that the proposed model exhibits out-of-the-box capabilities for downstream tasks like image super-resolution and denoising.
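The refinement idea in the abstract (decode a sample from the VAE's low-dimensional latent, then run only a short, truncated reverse-diffusion chain on it rather than sampling from pure noise) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's actual parameterization: `toy_vae_decode`, `refine`, the noising schedule, and the stand-in denoiser are all hypothetical names and choices for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_vae_decode(z):
    # Hypothetical stand-in for a VAE decoder: maps a low-dimensional
    # latent z to a coarse, "blurry" sample (here just a constant vector).
    return np.tanh(z.mean()) * np.ones(4)

def refine(x_vae, denoise_fn, t_start=0.3, steps=10):
    # Partially noise the VAE sample to an intermediate time t_start < 1,
    # then run a short deterministic reverse chain back to t = 0.
    # Starting below t = 1 is what makes refinement cheaper than sampling
    # the diffusion model from pure noise.
    x = np.sqrt(1 - t_start) * x_vae + np.sqrt(t_start) * rng.normal(size=x_vae.shape)
    ts = np.linspace(t_start, 0.0, steps + 1)
    for t_hi, t_lo in zip(ts[:-1], ts[1:]):
        # DDIM-style update: predict the clean sample, then re-noise to t_lo.
        x0_hat = denoise_fn(x, t_hi)
        eps_hat = (x - np.sqrt(1 - t_hi) * x0_hat) / np.sqrt(t_hi)
        x = np.sqrt(1 - t_lo) * x0_hat + np.sqrt(t_lo) * eps_hat
    return x

# Toy "denoiser" that pulls toward a fixed target; in VAEDM a trained
# network (conditioned on the VAE latent) would play this role.
target = np.ones(4)
denoise_fn = lambda x, t: x + 0.5 * (target - x)

z = rng.normal(size=8)          # low-dimensional VAE latent
x_refined = refine(toy_vae_decode(z), denoise_fn)
print(x_refined.shape)          # (4,)
```

The key design point the sketch illustrates: because the chain starts at an intermediate noise level and the VAE output anchors the sample, far fewer reverse steps are needed than in an unconditional diffusion model, and the sample remains tied to an interpretable latent code.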