TL;DR: Contrary to common belief, noise conditioning is not essential for denoising generative models: most models perform well, or even better, without it. This finding is supported by theoretical analysis and by a noise-unconditional model that achieves competitive results.
Abstract: It is widely believed that noise conditioning is indispensable for denoising diffusion models to work successfully. This work challenges this belief. Motivated by research on blind image denoising, we investigate a variety of denoising-based generative models in the absence of noise conditioning. To our surprise, most models exhibit graceful degradation, and in some cases, they even perform better without noise conditioning. We provide a mathematical analysis of the error introduced by removing noise conditioning and demonstrate that our analysis aligns with empirical observations. We further introduce a noise-*unconditional* model that achieves a competitive FID of 2.23 on CIFAR-10, significantly narrowing the gap to leading noise-conditional models. We hope our findings will inspire the community to revisit the foundations and formulations of denoising generative models.
Lay Summary: Denoising diffusion models are powerful tools that can generate realistic images from random noise. Traditionally, these models rely heavily on a technique called noise conditioning, which means they are told how much noise is in the image at each step of the generation process. Until now, it was widely assumed that this information is essential for the model to work well.
In this study, the researchers challenge that assumption. Inspired by methods in blind image denoising—where the model doesn’t know how noisy an image is—they explore what happens when diffusion models are trained without any noise information. Surprisingly, many of these models still work quite well, and some even improve.
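To make the distinction concrete, here is a minimal sketch contrasting the two settings. This is not the paper's architecture; the tiny MLPs, dimensions, and linear noise corruption below are illustrative assumptions only. The only point is the interface: the conditional denoiser receives the noise level `t` as an extra input, while the blind (noise-unconditional) one must cope without it.

```python
import torch
import torch.nn as nn

class ConditionalDenoiser(nn.Module):
    """Standard setup: the network is told the noise level t."""
    def __init__(self, dim=784, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x_noisy, t):
        # Append the noise level as one extra input feature.
        return self.net(torch.cat([x_noisy, t[:, None]], dim=1))

class UnconditionalDenoiser(nn.Module):
    """Blind setup studied in the paper: no noise level is given."""
    def __init__(self, dim=784, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x_noisy):
        return self.net(x_noisy)

# One training step of a simple denoising objective for each model.
x = torch.randn(32, 784)              # stand-in for clean data
t = torch.rand(32)                    # random noise levels in [0, 1]
x_noisy = x + t[:, None] * torch.randn_like(x)  # level-t corruption

cond, uncond = ConditionalDenoiser(), UnconditionalDenoiser()
loss_cond = ((cond(x_noisy, t) - x) ** 2).mean()
loss_uncond = ((uncond(x_noisy) - x) ** 2).mean()  # same target, no t
```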
The authors also provide a mathematical explanation for why removing noise conditioning doesn’t hurt performance as much as expected. To support their findings, they introduce a new model that doesn’t use noise conditioning at all, yet achieves strong performance on a standard image generation benchmark (CIFAR-10), coming close to the best existing models.
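One way to see why the degradation can be graceful (a standard estimation-theory observation sketched here under a squared-error denoising objective; it is not the paper's full analysis, which bounds the resulting error): the optimal blind denoiser does not collapse, but averages the level-specific targets weighted by how likely each noise level is given the noisy input,

$$
R^*(z) \;=\; \mathbb{E}[x \mid z] \;=\; \mathbb{E}_{t \mid z}\!\big[\,\mathbb{E}[x \mid z, t]\,\big],
$$

where $z$ is the noisy input and $t$ the unobserved noise level. When $t$ is nearly determined by $z$, as tends to happen for high-dimensional images, this blind target almost coincides with the noise-conditional one.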
This work opens up new possibilities for building simpler and potentially more robust image generation models by rethinking how noise is handled.
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: noise conditioning, generative models, diffusion, score-based generative models, flow matching
Submission Number: 7307