Selective Underfitting in Diffusion Models

Published: 23 Sept 2025, Last Modified: 23 Dec 2025 · SPIGM @ NeurIPS · CC BY 4.0
Keywords: diffusion models, generative models, generalization, scaling law
TL;DR: Diffusion models selectively underfit, which is key to their generalization and generative performance.
Abstract: Diffusion models have emerged as the principal paradigm for generative modeling across various domains. During training, they learn the score function, which is then used to generate samples at inference. This raises a basic yet unresolved question: *which* score do they actually learn? In principle, a diffusion model that matches the empirical score everywhere in the data space would simply reproduce the training data, failing to generate novel samples. Recent work addresses this paradox by arguing that diffusion models *underfit* the empirical score due to training-time inductive biases. In this paper, we show that this perspective is incomplete. Rather than underfitting the score everywhere, better diffusion models more accurately approximate the score in certain regions of input space while underfitting it in others. We characterize these regions and design empirical interventions to validate our perspective. Our results establish that this viewpoint, named *selective underfitting*, is essential for understanding diffusion models, yielding new, testable insights into their generalization and generative performance.
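For context on the paradox the abstract invokes, the empirical score admits a standard closed form. The sketch below assumes a Gaussian forward process $x_t = \alpha_t x_0 + \sigma_t \epsilon$ with schedule parameters $\alpha_t, \sigma_t$ (the paper's exact parameterization may differ). Given a training set $\{x_i\}_{i=1}^n$, the noised empirical distribution is a Gaussian mixture

$$p_t(x) = \frac{1}{n}\sum_{i=1}^n \mathcal{N}\!\left(x;\, \alpha_t x_i,\, \sigma_t^2 I\right),$$

whose score is a softmax-weighted pull toward the (noised) training points:

$$\nabla_x \log p_t(x) = \sum_{i=1}^n w_i(x)\,\frac{\alpha_t x_i - x}{\sigma_t^2}, \qquad w_i(x) = \frac{\mathcal{N}(x;\, \alpha_t x_i,\, \sigma_t^2 I)}{\sum_{j=1}^n \mathcal{N}(x;\, \alpha_t x_j,\, \sigma_t^2 I)}.$$

As $\sigma_t \to 0$, the weights $w_i(x)$ concentrate on the nearest training point, so a model that matched this score exactly would be driven back onto the training data, which is why exact score matching precludes novel samples.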
Submission Number: 63