Keywords: Text-to-video, Machine Unlearning, Concept Erasure, Diffusion Model
TL;DR: We propose T2VUnlearning, the first robust and precise method for unlearning specific concepts in text-to-video models.
Abstract: Recent advances in text-to-video (T2V) diffusion models have significantly improved the quality of generated videos. However, their capability to produce explicit or harmful content raises new concerns about misuse and potential rights violations. To address this emerging threat, we propose unlearning-based concept erasure as a solution. First, we adopt negatively-guided velocity-prediction fine-tuning and enhance it with prompt augmentation for robustness against prompts refined by large language models (LLMs). Second, to achieve precise unlearning, we incorporate mask-based localization regularization and concept-preservation regularization, which together preserve the model's ability to generate non-target concepts. Extensive experiments demonstrate that our method effectively erases a specified concept while retaining the model's generation capability for all other concepts, outperforming existing methods.
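To make the core idea of negatively-guided velocity-prediction fine-tuning concrete, below is a minimal PyTorch sketch, not the authors' code. It assumes an ESD-style setup adapted to velocity-predicting T2V backbones: a frozen teacher supplies conditional and unconditional velocity predictions, and the trainable student is regressed toward a negatively-guided target that steers generation away from the concept. All names (`student`, `teacher`, `target_emb`, `null_emb`, `guidance_scale`) are illustrative assumptions, and the prompt-augmentation and regularization terms from the abstract are omitted.

```python
# Hedged sketch of one negatively-guided velocity-prediction
# fine-tuning step; interfaces and hyperparameters are assumptions,
# not the paper's actual implementation.
import torch
import torch.nn.functional as F

def unlearning_step(student, teacher, optimizer, x_t, t,
                    target_emb, null_emb, guidance_scale=3.0):
    """One gradient step pushing the student's velocity prediction
    for the target concept away from that concept.

    student    -- trainable copy of the T2V backbone (predicts velocity)
    teacher    -- frozen copy of the original backbone
    x_t        -- noised video latents at timestep t
    target_emb -- text embedding of the concept to erase (hypothetical name)
    null_emb   -- embedding of the empty/unconditional prompt
    """
    with torch.no_grad():
        v_cond = teacher(x_t, t, target_emb)  # teacher, concept-conditioned
        v_null = teacher(x_t, t, null_emb)    # teacher, unconditional
        # Negative guidance: invert the concept direction so the
        # regression target points *away* from the erased concept.
        v_target = v_null - guidance_scale * (v_cond - v_null)

    v_pred = student(x_t, t, target_emb)
    loss = F.mse_loss(v_pred, v_target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this reading, the teacher's guidance term `(v_cond - v_null)` isolates the concept's contribution to the velocity field, and negating it gives the student a target that suppresses the concept; the paper's mask-based localization and concept-preservation regularizers would be added to this loss to keep the edit precise.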
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 11837