Variance as a Catalyst: Efficient and Transferable Semantic Erasure Adversarial Attack for Customized Diffusion Models
Abstract: Latent Diffusion Models (LDMs) can be fine-tuned with only a few images and have become widely used on the Internet. However, they can also be misused to generate fake images, leading to privacy violations and social risks. Existing adversarial attack methods primarily introduce noise distortions into generated images but fail to completely erase identity semantics.
In this work, we identify the variance of the VAE latent code as a key factor influencing image distortion: larger variances produce stronger distortions and ultimately erase semantic information. Based on this finding, we propose a Laplace-based (LA) loss function that optimizes along the direction of fastest variance growth, ensuring that each optimization step is locally optimal. We further analyze the limitations of existing methods and reveal that their loss functions often fail to align gradient signs with the direction of variance growth and struggle to optimize efficiently under different variance distributions. To address these issues, we propose a novel Lagrange Entropy-based (LE) loss function.
Experimental results demonstrate that our methods achieve state-of-the-art performance on CelebA-HQ and VGGFace2. Both proposed loss functions effectively lead diffusion models to generate pure-noise images with identity semantics completely erased. Furthermore, our methods exhibit strong transferability across diverse models and efficiently complete attacks with minimal computational resources. Our work provides a practical and efficient solution for privacy protection.
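To make the core idea concrete, the following is a minimal, illustrative sketch of a variance-maximizing perturbation on a Stable Diffusion VAE, reflecting the abstract's observation that larger latent variance drives stronger distortion. It is not the paper's LA or LE loss; the checkpoint name, surrogate objective, step size, and perturbation budget are assumptions chosen for illustration.

```python
# Hedged sketch: PGD-style attack that increases the variance of the VAE
# latent code, as a stand-in for the variance-growth objective described
# in the abstract. Not the authors' LA/LE losses.
import torch
from diffusers import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"
# Assumed checkpoint; any Stable-Diffusion-compatible VAE would work here.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device).eval()
vae.requires_grad_(False)  # freeze VAE weights; only the perturbation is optimized


def variance_maximizing_attack(image, eps=8 / 255, alpha=2 / 255, steps=50):
    """image: float tensor in [-1, 1], shape (1, 3, H, W), on `device`."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        latent_dist = vae.encode(image + delta).latent_dist
        # Surrogate objective: push up the variance of the latent mean.
        loss = latent_dist.mean.var()
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()       # L_inf gradient-sign ascent
            delta.clamp_(-eps, eps)                  # perturbation budget
            delta.clamp_(-1 - image, 1 - image)      # keep pixels within [-1, 1]
        delta.grad.zero_()
    return (image + delta).detach()
```

In this sketch the perturbed image would then be used as the fine-tuning input, so that a model customized on it learns a high-variance latent representation and, per the abstract's finding, generates heavily distorted outputs rather than the protected identity.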
Lay Summary: Modern AI models can recreate a person’s face using just a few photos, raising serious privacy concerns. These models may be misused to generate fake images that resemble real people. We asked how to prevent the model from learning and replicating someone's identity in the first place. Our approach adds carefully crafted noise to the input image. These small changes are almost invisible to the human eye but are designed to disrupt how the model learns. By breaking the connection between the input image and the generated output, the AI model fails to capture identity-related features. As a result, it produces random, meaningless noise instead of a recognizable face. This method is fast, effective, and works across different AI models. It offers a practical way to protect individuals from the unauthorized use of their images and can also help artists defend their work from being copied by AI. Our work supports safer and more responsible use of generative AI technologies.
Link To Code: https://github.com/youyuanyi/variance-as-Catalyst
Primary Area: Social Aspects->Privacy
Keywords: Customized Latent Diffusion Models, Adversarial Attacks, Privacy Protection, Transferability, AI Security
Submission Number: 1524