Variational Diffusion Unlearning: a variational inference framework for unlearning in diffusion models
Keywords: safety, privacy in generative models
Abstract: For the responsible and safe deployment of diffusion models across various domains, regulating their generated outputs is desirable, since such models could otherwise produce undesired violent or obscene content. One of the most popular ways to tackle this problem is to apply \emph{machine unlearning} methodology so that a pre-trained generative model forgets the training data points containing these undesired features. The principal objective of this work is thus to propose a machine unlearning method that prevents a pre-trained diffusion model from generating outputs containing undesired features. Our method, termed \underline{V}ariational \underline{D}iffusion \underline{U}nlearning (\textbf{VDU}), is a \textbf{one-step method} that \textit{only requires access to a subset of the training data containing the undesired features to be forgotten}. Our approach is inspired by variational inference and minimizes a loss function consisting of two terms: a \emph{plasticity inducer} and a \emph{stability regularizer}. The \emph{plasticity inducer} reduces the log-likelihood of the undesired training data points, while the \emph{stability regularizer}, essential for preventing a loss of image sample quality, regularizes the model in parameter space. We validate the effectiveness of our method through comprehensive experiments, forgetting user-defined classes of the MNIST and CIFAR-10 datasets from a pre-trained unconditional denoising diffusion probabilistic model (DDPM).
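As an illustrative sketch only (the notation below is assumed, not taken from the paper), an objective of the kind described in the abstract could be written with $\mathcal{D}_f$ denoting the forget set, $\theta_0$ the pre-trained parameters, and $\lambda$ a trade-off weight:

$$
\mathcal{L}_{\mathrm{VDU}}(\theta) \;=\; \underbrace{\mathbb{E}_{x \sim \mathcal{D}_f}\big[\log p_\theta(x)\big]}_{\text{plasticity inducer}} \;+\; \underbrace{\lambda\,\lVert \theta - \theta_0 \rVert_2^2}_{\text{stability regularizer}}
$$

Minimizing the first term lowers the (variational bound on the) log-likelihood of the undesired data; in a DDPM this bound would in practice be replaced by the negative of the standard noise-prediction loss on the forget set. The second term is shown here as a simple L2 penalty in parameter space that keeps the updated model close to the pre-trained one to preserve sample quality; the paper's actual regularizer may take a different form.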
Submission Number: 188