VAEs meet Diffusion Models: Efficient and High-Fidelity Generation

Kushagra Pandey; Avideep Mukherjee; Piyush Rai; Abhishek Kumar

Published: 08 Dec 2021, Last Modified: 05 May 2023
DGMs and Applications @ NeurIPS 2021 Oral
Readers: Everyone

Keywords: Deep Generative Models, VAE, diffusion models, explicit-likelihood models

TL;DR: A novel framework unifying VAEs and diffusion models for efficient, high-fidelity image synthesis.

Abstract: Diffusion probabilistic models achieve state-of-the-art results on several competitive image synthesis benchmarks, but they lack a low-dimensional, interpretable latent space and are slow at generation. Variational Autoencoders (VAEs), on the other hand, have access to a low-dimensional latent space but, despite recent advances, exhibit poor sample quality. We present VAEDM, a novel generative framework for refining VAE-generated samples using diffusion models, along with a novel conditional forward process parameterization for diffusion models. We show that the resulting parameterization can improve upon the unconditional diffusion model in sampling efficiency during inference while also equipping diffusion models with the low-dimensional, VAE-inferred latent code. Furthermore, we show that the proposed model exhibits out-of-the-box capabilities for downstream tasks such as image super-resolution and denoising.
