MAGIC: Diffusion Model Memorization Auditing via Generative Image Compression

Published: 11 Jun 2025, Last Modified: 14 Jul 2025 · MemFM · CC BY 4.0
Keywords: Memorization, Diffusion Models, Compression
TL;DR: We show the brittleness of existing metrics for detecting memorization in diffusion models and introduce MAGIC, a robust method that reframes memorization into an image compression problem.
Abstract: Diffusion models have revolutionized generative modeling by producing high-fidelity images. However, concerns about $\textit{memorization}$—where models reproduce specific training images—pose ethical and legal challenges, especially regarding copyrighted content. In this paper, we critically analyze current memorization criteria, showing that they are brittle because they rely on specific caption-image pairs and break under prompt modifications that are standard industry practice at both training and inference time. We propose $\texttt{MAGIC}$, a novel method for $\textbf{M}$emorization $\textbf{A}$uditing via $\textbf{G}$enerative $\textbf{I}$mage $\textbf{C}$ompression, which reframes memorization detection as an image compression problem. Specifically, we investigate whether the model can regenerate a particular image independently of any textual prompt. By compressing an image into a short learned conditioning (embedding), we directly measure how faithfully a diffusion model can reconstruct it. Experimentally, $\texttt{MAGIC}$ significantly improves robustness and accuracy (by over 20%) in detecting memorized content compared to existing approaches. $\texttt{MAGIC}$ thus enhances our understanding of memorization and provides practical tools for developing safer generative systems.
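The compression-based auditing idea in the abstract can be illustrated with a toy numpy sketch. Here a frozen random linear map stands in for the diffusion model's generator (an assumption for illustration only; the actual method optimizes a conditioning embedding for a diffusion model), and the "memorization score" is the reconstruction error achievable from a short optimized embedding: images the model can generate compress to near-zero error, while arbitrary images do not.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 64, 8                       # image dimension, short embedding dimension

# Frozen "generator": a stand-in for the pretrained diffusion model (assumption).
W = rng.standard_normal((d, k))

def compression_error(target, lr=1e-3, steps=2000):
    """Optimize a short embedding z so that W @ z reconstructs `target`,
    then return the relative reconstruction error (the MAGIC-style score)."""
    z = np.zeros(k)
    for _ in range(steps):
        z -= lr * 2 * W.T @ (W @ z - target)   # gradient of ||Wz - x||^2
    return np.linalg.norm(W @ z - target) / np.linalg.norm(target)

# An image the model can regenerate (lies in the generator's range)...
memorized = W @ rng.standard_normal(k)
# ...versus an arbitrary image it cannot.
unrelated = rng.standard_normal(d)

print(f"in-model error:  {compression_error(memorized):.4f}")   # near 0
print(f"out-of-model error: {compression_error(unrelated):.4f}")  # large
```

The gap between the two scores is what makes the audit prompt-independent: no caption is involved, only how well a short learned conditioning lets the model reproduce the image.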
Submission Number: 21