Keywords: Machine unlearning, diffusion model, fake unlearning
Abstract: Diffusion models (DMs) have demonstrated remarkable generative capabilities in image generation but also pose privacy and copyright risks by memorizing and exposing training images. This concern is heightened by privacy regulations such as GDPR, which grant individuals the right to request the deletion of their data from AI models.
Machine unlearning (MU) has been proposed to address this issue, as it enables the selective removal of specific training data from AI models. However, most existing MU methods for DMs focus on unlearning at the class level, either by removing entire classes of data or class-specific features. In contrast, sample-level machine unlearning (SLMU), which targets the removal of individual training samples, remains underexplored. SISS is the pioneering work on SLMU for DMs. However, after careful investigation, we find that the evaluation metric used in SISS does not adequately assess unlearning performance. Moreover, under our proposed evaluation framework, SISS cannot achieve complete unlearning and exhibits significant degradation in generative performance.
In this paper, we first define the objective of SLMU for DMs.
Building on this definition, we introduce a quantitative evaluation framework for constructing benchmarks that compare different methods. Using this framework, we are the first to identify the fake unlearning phenomenon.
Additionally, we propose a novel Sample-Level Machine Unlearning approach for Diffusion models, termed SMUD.
SMUD alters the generative path of the targeted images, steering the DM to generate images that differ from the targeted samples.
Quantitative experimental results against baselines demonstrate that the proposed SMUD is the only method that can achieve SLMU without fake unlearning for both unconditional and conditional DMs.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 16168