Keywords: Diffusion Model, Backdoor Attack
TL;DR: We introduce Eidolon, the first backdoor attack on diffusion models that stealthily causes a widespread backdoor pandemic by passively transferring the attack to downstream models through synthetic data generation.
Abstract: The remarkable success of modern Deep Neural Networks (DNNs) can be primarily attributed to access to compute resources and high-quality labeled data, which are often costly and challenging to acquire. Recently, text-to-image Diffusion Models (DMs) have emerged as powerful data generators for augmenting training datasets. Machine learning practitioners often use off-the-shelf third-party DMs to generate synthetic data without domain-specific expertise or adaptation. This practice gives rise to a novel and insidious threat: a diffusion model infected with a backdoor can effectively spread the backdoor to a large number of downstream models, causing a backdoor pandemic. To realize this threat for the first time, we propose Eidolon, an attack designed and optimized to stealthily transfer a backdoor injected into a single diffusion model to a virtually unlimited number of downstream models, without the attacker playing any active role in the downstream training tasks. Eidolon not only makes the attack stealthier and more effective, but also operates under a stricter threat model for injecting backdoors into downstream models than conventional backdoor attacks. We propose four necessary tests that a successful backdoor attack on a diffusion model must pass to cause a backdoor pandemic. Our evaluation across a wide range of benchmark datasets and model architectures shows that only our attack passes all of these tests, causing a widespread pandemic across many downstream classifiers.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 14041