Abstract: This paper focuses on implanting multiple heterogeneous backdoor triggers in bridge-based diffusion models designed for complex and arbitrary input distributions. Existing backdoor formulations mainly address single-attack scenarios and are limited to models with Gaussian noise inputs. To fill this gap, we propose MixBridge, a novel diffusion Schrödinger bridge (DSB) framework that accommodates arbitrary input distributions (with image-to-image (I2I) tasks as special cases). Beyond this, we demonstrate that backdoor triggers can be injected into MixBridge by directly training on poisoned image pairs. This eliminates the cumbersome modifications to stochastic differential equations required by previous studies, providing a flexible tool for studying backdoor behavior in bridge models. However, a key question arises: can a single DSB model learn multiple backdoor triggers? Unfortunately, our theory shows that when attempting this, the model ends up following the geometric mean of the benign and backdoored distributions, leading to performance conflicts across backdoor tasks. To overcome this, we propose a Divide-and-Merge strategy to mix different bridges, where models are independently pre-trained for each specific objective (Divide) and then integrated into a unified model (Merge). In addition, we design a Weight Reallocation Scheme (WRS) to enhance the stealthiness of MixBridge. Empirical studies across diverse generation tasks demonstrate the efficacy of MixBridge. The code is available at: https://github.com/qsx830/MixBridge.
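To make the Divide-and-Merge idea concrete, here is a minimal PyTorch sketch of the Merge step: several independently pre-trained bridge experts (one benign, one per backdoor) are combined by a learned router that infers mixing weights from the (possibly triggered) input. The class and argument names (`MergedBridge`, `experts`, `router`) are hypothetical illustrations, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class MergedBridge(nn.Module):
    """Sketch of the Merge step: pre-trained DSB experts mixed by a router."""

    def __init__(self, experts, router):
        super().__init__()
        self.experts = nn.ModuleList(experts)  # pre-trained per objective (Divide)
        self.router = router                   # gate trained during merging (Merge)

    def forward(self, x_t, t):
        # The router predicts per-expert logits from the noisy input and time,
        # picking up subtle trigger patterns when they are present.
        w = torch.softmax(self.router(x_t, t), dim=-1)                 # (B, K)
        outs = torch.stack([e(x_t, t) for e in self.experts], dim=1)   # (B, K, C, H, W)
        w = w.view(*w.shape, *([1] * (outs.dim() - 2)))                # broadcast weights
        return (w * outs).sum(dim=1)  # weighted combination of expert predictions
```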
Lay Summary: We show how image-editing AI can be secretly hijacked to produce harmful content only when a hidden “trigger” is present, while behaving normally otherwise. Current attacks focus on models that generate images from random noise, but many real-world tools take existing images as input (e.g., photo upscalers or inpainting). In this work, we introduce MixBridge, a new framework that lets attackers embed multiple distinct triggers in these “image-to-image” systems. Instead of rewriting complex equations for each trigger, MixBridge simply trains on pairs of marked (poisoned) images and their intended malicious outputs. However, trying to teach one model all triggers at once leads to conflicts: it ends up blending benign and malicious behaviors and cannot satisfy either well. To solve this, we first train separate “expert” models for each task (clean editing or one of several backdoors), then merge them using a lightweight router that learns which expert to use based on subtle patterns in the input. A small “weight reallocation” penalty encourages the router to spread responsibility across experts, making backdoors harder to detect. Experiments show that MixBridge yields high-quality normal edits and, when a trigger is added, reliably produces the corresponding malicious output. We release our code so others can study and defend against these vulnerabilities.
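The “weight reallocation” penalty mentioned above could take several forms; one plausible sketch, assuming the router outputs a per-sample distribution over experts, is a regularizer that pulls the batch-average expert usage toward uniform so no single expert dominates. The function name and the KL-to-uniform formulation here are illustrative assumptions, not the paper's exact WRS loss.

```python
import torch

def weight_reallocation_penalty(router_weights, eps=1e-8):
    """Hypothetical WRS-style regularizer.

    router_weights: (B, K) softmax outputs of the router for a batch.
    Penalizes concentrating mass on one expert by measuring the KL
    divergence between the mean allocation and the uniform distribution.
    """
    avg = router_weights.mean(dim=0)                    # (K,) average usage per expert
    uniform = torch.full_like(avg, 1.0 / avg.numel())   # ideal spread-out allocation
    return (avg * (avg.add(eps).log() - uniform.log())).sum()

# Usage sketch: add to the merging objective with a small coefficient,
# e.g. loss = task_loss + 0.01 * weight_reallocation_penalty(w)
```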
Link To Code: https://github.com/qsx830/MixBridge
Primary Area: Social Aspects->Security
Keywords: Backdoor Attack, Schrödinger Bridge
Submission Number: 5787