Dynamic Negative Guidance of Diffusion Models: Towards Immediate Content Removal

Felix Koulischer; Johannes Deleu; Gabriel Raya; Thomas Demeester; Luca Ambrogioni

Published: 12 Oct 2024, Last Modified: 14 Nov 2024, Venue: SafeGenAi Poster, License: CC BY 4.0

Keywords: Artificial Intelligence Safety, Classifier-Free Guidance, Negative Prompting

TL;DR: A novel, theoretically grounded Dynamic Negative Guidance scheme is proposed as a temporary but immediate solution for removing sensitive concepts, such as a celebrity's identity, from the outputs of a diffusion model.

Abstract: The rise of highly realistic large-scale generative diffusion models comes hand in hand with public safety concerns. In addition to the risk of generating *Not-Safe-For-Work* content from models trained on large internet-scraped datasets, there is a serious concern about reproducing copyrighted material, including celebrity images and artistic styles. We introduce ***D**ynamic **N**egative **G**uidance*, a theoretically grounded negative guidance scheme that can avoid the generation of unwanted content without drastically harming the diversity of the model. Our approach avoids some of the disadvantages of the widespread, yet theoretically unfounded, Negative Prompting algorithm. Our guidance scheme does not require retraining the conditional model and can therefore be applied as a temporary solution to meet customer requests until model fine-tuning is possible.

Submission Number: 197
