Improved denoising diffusion probabilistic models with efficient non-diagonal covariance modeling

TMLR Paper7891 Authors

11 Mar 2026 (modified: 30 May 2026), Decision pending for TMLR, CC BY 4.0
Abstract: The sampling process of Denoising Diffusion Probabilistic Models (DDPMs) can be accelerated by leveraging second-order information in the form of approximations to the denoising posterior covariance. Previous attempts at using such information have relied on drastic (e.g. diagonal) simplifications of the covariance. These do not do justice to the peculiar statistical structure of natural images, which exhibit strong non-diagonal correlations between pixels and color channels, and a slowly decaying power-law frequency spectrum. Here, we develop a novel covariance model that captures these features. Our Kronecker-DCT (K-DCT) model uses a Kronecker-factored decomposition into inter-color covariances and spatial covariances, with the latter modeled in the frequency domain using the Discrete Cosine Transform (DCT). The use of the DCT reduces the computational complexity from quadratic to log-linear, resulting in negligible computational and memory overhead in the sampling process. By learning K-DCT-structured amortizations of the denoising posterior covariance using pre-trained score models on the CIFAR-10, Celeb-A, and ImageNet datasets, we show improved performance in terms of both FID and likelihood compared to previous state-of-the-art denoising samplers.
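As a rough illustration of the structure the abstract describes, the sketch below applies a covariance of the form (color covariance) kron (spatial covariance), where the spatial factor is diagonal in an orthonormal 2D DCT basis. This is only one plausible reading of the abstract, not the authors' parameterization: the function names, the per-frequency `spectrum` array, and the power-law form with exponent `alpha` are all assumptions made for illustration. The fast path costs O(HW log HW) per channel via the DCT, while the dense matrix is built only to check it on tiny inputs.

```python
import numpy as np
from scipy.fft import dctn, idctn


def power_law_spectrum(height, width, alpha=1.0):
    """Hypothetical per-frequency variances decaying as a power law."""
    fy = np.arange(height)[:, None]
    fx = np.arange(width)[None, :]
    return 1.0 / (1.0 + fx**2 + fy**2) ** alpha


def kdct_matvec(x, color_cov, spectrum):
    """Apply Sigma = color_cov kron (D^T diag(spectrum) D) to x of shape (C, H, W).

    D is the orthonormal 2D DCT-II, so its inverse equals its transpose.
    """
    # Spatial factor: DCT, scale each frequency, inverse DCT.
    coeffs = dctn(x, type=2, norm="ortho", axes=(1, 2))
    spatial = idctn(coeffs * spectrum, type=2, norm="ortho", axes=(1, 2))
    # Color factor: mix channels with the (C, C) color covariance.
    return np.einsum("ij,jhw->ihw", color_cov, spatial)


def kdct_dense(color_cov, spectrum):
    """Build the same covariance as a dense matrix (for checking only)."""
    height, width = spectrum.shape
    n = height * width
    # Row k of the reshaped array is the DCT of basis image k,
    # so the transpose is the matrix D acting on flattened images.
    basis = np.eye(n).reshape(n, height, width)
    D = dctn(basis, type=2, norm="ortho", axes=(1, 2)).reshape(n, n).T
    spatial = D.T @ np.diag(spectrum.ravel()) @ D
    return np.kron(color_cov, spatial)
```

The dense version needs (C·H·W)^2 entries, which is why it is only practical for sanity checks; the structured product never forms a matrix larger than the C-by-C color factor.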
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Alex_Wong2
Submission Number: 7891