TL;DR: Empirical patch-based denoisers approximate the generalization behaviour of image diffusion models
Abstract: We propose a simple, training-free mechanism that explains the generalization behaviour of diffusion models. By comparing pre-trained diffusion models to their theoretically optimal empirical counterparts, we identify a shared local inductive bias across a variety of network architectures. From this observation, we hypothesize that network denoisers generalize through localized denoising operations, as these operations approximate the training objective well over much of the training distribution. To validate our hypothesis, we introduce novel denoising algorithms which aggregate local empirical denoisers to replicate network behaviour. When compared to network denoisers across forward and reverse diffusion processes, our algorithms exhibit consistent visual similarity to neural network outputs, with lower mean squared error than previously proposed methods.
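The empirical denoiser referenced in the abstract can be sketched concretely. Under Gaussian noise, the theoretically optimal denoiser over a finite training set is a softmax-weighted average of training examples (the posterior mean); a local variant applies this per patch and reassembles the image. The sketch below is illustrative only: the function names, the non-overlapping tiling, and the patch bank are our assumptions, not PSPC's actual aggregation scheme.

```python
import numpy as np

def empirical_denoiser(x_noisy, train_data, sigma):
    """Optimal empirical denoiser under Gaussian noise:
    a softmax-weighted average of training examples,
    weighted by their likelihood of producing x_noisy."""
    # Squared distance from the noisy input to each training example.
    d2 = ((train_data - x_noisy) ** 2).sum(axis=1)
    logw = -d2 / (2 * sigma ** 2)
    w = np.exp(logw - logw.max())  # subtract max for numerical stability
    w /= w.sum()
    return w @ train_data

def patch_denoiser(x_noisy, train_patches, sigma, patch=8):
    """Hypothetical local variant: denoise each non-overlapping
    patch against a bank of flattened training patches, then
    reassemble. (PSPC's real aggregation may differ.)"""
    h, w_ = x_noisy.shape
    out = np.zeros_like(x_noisy)
    for i in range(0, h, patch):
        for j in range(0, w_, patch):
            p = x_noisy[i:i + patch, j:j + patch].ravel()
            out[i:i + patch, j:j + patch] = empirical_denoiser(
                p, train_patches, sigma).reshape(patch, patch)
    return out
```

At low noise levels the global denoiser collapses onto the nearest training example (memorization), whereas the patch-based version can mix patches from different training images, which is the kind of local recombination the paper argues underlies generalization.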
Lay Summary: Diffusion models are a popular type of AI model for generating images. Starting from random noise, they generate images in a step-by-step process, cleaning up the image through "denoising" operations that convert noisy images into clean ones using a neural network. Interestingly, past research has shown that even when these models are built in different ways, they tend to produce similar images if trained on the same data and given the same starting point. This suggests that new images are produced through some shared, underlying mechanism common to many image diffusion models. In this work, we investigate what this mechanism might be. In particular, we propose that image diffusion models use a denoising operation built from a combination of denoising operations applied to smaller patches of the image. To test this hypothesis, we propose a simple approximation called PSPC which mimics this patch-based denoising behaviour without any training. We find that PSPC denoises images in a similar way to complex neural network-based diffusion models, supporting the idea that small, local operations can explain much of how diffusion models generate new images. This finding is an important step towards understanding how diffusion models work and could lead to diffusion models that are cheaper to run, more understandable, and more accountable.
Link To Code: https://github.com/plai-group/pspc
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Diffusion Models, Generalization, Diffusion, Generative Modelling
Submission Number: 8158