Spectral Signatures of Memorization in Diffusion Models: A Multi-Scale Diagnostic Study

Published: 26 May 2026, Last Modified: 26 May 2026 | ICML 2026 FoGen Workshop Poster | CC BY 4.0
Keywords: diffusion models, memorization, generalization, spectral analysis, stable rank, representation divergence, online diagnostics, early stopping, DDPM, U-Net bottleneck
TL;DR: Stable rank is a generation-free, online diagnostic for memorization in diffusion models; a simple threshold prevents 35 to 45 percentage points of memorization with zero false positives.
Abstract: We present a systematic spectral analysis of the memorization–generalization transition in diffusion models, comparing weight-space and representation-space diagnostics across 18 DDPM training runs on CIFAR-10 spanning six dataset scales. We find that stable rank emerges as the most reliable weight-space signal, achieving strong cross-condition correlation with memorization (r = 0.754) and, critically, consistent within-run tracking (mean r = +0.426), making it suitable for online monitoring. In representation space, representation divergence (REPDIV) is highly sensitive to the evaluation noise level: at low noise (t* = 100, high SNR), it becomes the strongest cross-condition correlate (r = 0.793), while at moderate noise (t* = 500) it collapses due to noise domination. We demonstrate the practical utility of these findings via an early stopping rule based on stable rank, which prevents 27–34 percentage points of memorization in low-data regimes without false positives in high-data settings. Both diagnostics concentrate at the U-Net bottleneck, and training loss provides no within-run signal. Together, these results establish a generation-free spectral toolkit for monitoring memorization during training and identify evaluation noise level as a previously overlooked but critical methodological variable.
Submission Number: 59
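Illustrative sketch: the abstract does not spell out how stable rank is computed or how the stopping threshold is applied, so the snippet below assumes the standard definition, the squared Frobenius norm divided by the squared spectral norm of a weight matrix, with convolution kernels flattened to two dimensions. The function names, the flattening convention, and the direction of the threshold test (stop once stable rank rises to the threshold, motivated by the positive correlation with memorization reported above) are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def stable_rank(weight):
    """Stable rank ||W||_F^2 / ||W||_2^2 (standard definition, assumed here)."""
    w = np.asarray(weight, dtype=np.float64)
    # Assumption: flatten conv kernels (out, in, kh, kw) to (out, in*kh*kw).
    w = w.reshape(w.shape[0], -1)
    # Singular values are returned in descending order.
    singular_values = np.linalg.svd(w, compute_uv=False)
    top = singular_values[0]
    if top == 0.0:
        return 0.0  # all-zero weights: define stable rank as 0
    return float(np.sum(singular_values ** 2) / top ** 2)


def should_stop(stable_rank_history, threshold):
    """Hypothetical rule: stop once the latest stable rank reaches the threshold."""
    return len(stable_rank_history) > 0 and stable_rank_history[-1] >= threshold
```

Because it needs only the singular values of a layer's weights, this quantity can be logged at each checkpoint without sampling from the model, which is what makes it usable as an online, generation-free monitor. A suitable threshold value would have to be taken from the paper's experiments.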