Keywords: Denoising Diffusion Probabilistic Models, Covariance Modeling, Image Generation
TL;DR: An efficient non-diagonal covariance model for denoising posteriors improves statistical goodness-of-fit and quality of image generation in DDPMs.
Abstract: The sampling process of Denoising Diffusion Probabilistic Models (DDPMs) can be accelerated by leveraging second-order information in the form of approximations to the denoising posterior covariance.
Previous attempts to use such information have relied on drastic simplifications of the covariance, such as diagonal approximations.
These do not do justice to the peculiar statistical structure of natural images, which exhibit strong non-diagonal correlations between pixels and color channels, and a slowly decaying power-law frequency spectrum.
Here, we develop a novel covariance model that captures these features. Our Kronecker-DCT (K-DCT) model uses a Kronecker-factored decomposition of inter-color covariances and spatial covariances modeled in the frequency domain using the Discrete Cosine Transform (DCT).
The use of the DCT reduces the computational complexity from quadratic to log-linear, resulting in negligible computational and memory overhead in the sampling process.
By learning K-DCT-structured amortizations of the denoising posterior covariance using pre-trained score models on CIFAR-10, Celeb-A, and ImageNet datasets, we show improved performance both in terms of FID and likelihoods compared to previous SOTA denoising samplers.
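To make the structure described in the abstract concrete, the following is a minimal sketch of a covariance-vector product under an assumed form Sigma = C ⊗ (Dᵀ diag(s) D), where C is an inter-color covariance, D is the orthonormal 2D DCT, and s is a positive spectrum over spatial frequencies. The abstract does not specify the exact parametrization, so the function names `kdct_matvec` and `power_law_spectrum`, the exponent `alpha`, and the particular power-law form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.fft import dctn, idctn


def power_law_spectrum(height, width, alpha=2.0):
    """Hypothetical power-law spectrum s[u, v] = (1 + u^2 + v^2)^(-alpha / 2).

    The functional form and `alpha` are assumptions chosen for illustration.
    """
    u = np.arange(height)[:, None]
    v = np.arange(width)[None, :]
    return (1.0 + u**2 + v**2) ** (-alpha / 2.0)


def kdct_matvec(x, color_cov, spectrum):
    """Apply Sigma = color_cov ⊗ (D^T diag(spectrum) D) to an image.

    x         : array of shape (C, H, W)
    color_cov : array of shape (C, C), inter-color covariance
    spectrum  : array of shape (H, W), per-frequency variances
    """
    # Move to the frequency domain with an orthonormal 2D DCT per channel.
    coeffs = dctn(x, type=2, norm="ortho", axes=(-2, -1))
    # Scale each frequency; the spatial covariance is diagonal in this basis.
    coeffs = coeffs * spectrum
    # Return to pixel space with the inverse transform.
    spatial = idctn(coeffs, type=2, norm="ortho", axes=(-2, -1))
    # Mix the color channels with the small (C x C) factor.
    return np.einsum("cd,dhw->chw", color_cov, spatial)
```

Because the spatial factor is applied with a fast DCT and the color factor is only C × C, the product never forms the full (C·H·W) × (C·H·W) matrix. A call such as `kdct_matvec(x, np.eye(3), power_law_spectrum(32, 32))` applies the sketch to a 3 × 32 × 32 image.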
Supplementary Material: zip
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Submission Number: 17031