Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · License: CC BY 4.0
TL;DR: Bi-perspective Splitting Defense: Achieving Clean-Seed-Free Backdoor Security
Abstract: Backdoor attacks have seriously threatened deep neural networks (DNNs) by embedding concealed vulnerabilities through data poisoning. To counteract these attacks, training benign models from poisoned data has garnered considerable interest among researchers. High-performing defenses often rely on additional clean subsets/seeds, which is untenable due to increasing privacy concerns and data scarcity. In the absence of additional clean subsets/seeds, defenders resort to complex feature extraction and analysis, resulting in excessive overhead and compromised performance. To address these challenges, we identify that the key lies in sufficiently utilizing both the easier-to-obtain target labels and clean hard samples. In this work, we propose a Bi-perspective Splitting Defense (BSD). BSD distinguishes clean samples from both a semantic perspective and a loss-statistics perspective, through open set recognition-based splitting (OSS) and altruistic model-based data splitting (ALS), respectively. Through extensive experiments on benchmark datasets and against representative attacks, we empirically demonstrate that BSD surpasses existing defenses by over 20% in average Defense Effectiveness Rating (DER), achieving clean-data-free backdoor security.
Lay Summary: Imagine if someone could secretly tamper with an AI system, like a hacker slipping a hidden backdoor into a lock. That’s what backdoor attacks do: they quietly sneak triggers into AI training data so the model only misbehaves under specific conditions. Defending against these attacks is tough, especially when extra clean data isn’t available due to privacy concerns or limited resources. Our research offers a new training-time defense. We introduce BSD, which uses two perspectives to better spot poisoned data and improve training: how the AI understands each example, and how hard each example is for the AI to learn. These clues help distinguish clean data from poisoned data without needing additional clean samples. BSD performs well in tests across multiple datasets and attack types, and its detailed design may inspire future research.
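For intuition only, a minimal sketch of a loss-statistics split is shown below. It assumes the common observation that poisoned samples are fit quickly and show abnormally low loss early in training, while clean hard samples retain higher loss; it fits a two-component Gaussian mixture over per-sample losses and keeps the higher-loss component as a provisional clean pool. The function name, data-loading format, and 0.5 threshold are illustrative assumptions, and this is not the paper's actual ALS/OSS procedure.

```python
import torch
from sklearn.mixture import GaussianMixture


@torch.no_grad()
def split_by_loss_statistics(model, loader, device="cuda"):
    """Illustrative loss-statistics split of a possibly poisoned training set.

    Hypothetical sketch: poisoned samples often end up with unusually low loss,
    clean hard samples with higher loss, so a two-component Gaussian mixture
    over the per-sample loss distribution gives a soft clean/suspicious split.
    BSD's actual ALS/OSS criteria differ in detail.
    """
    model.eval()
    criterion = torch.nn.CrossEntropyLoss(reduction="none")

    # Collect one loss value per training sample (loader assumed unshuffled).
    all_losses = []
    for x, y in loader:
        logits = model(x.to(device))
        all_losses.append(criterion(logits, y.to(device)).cpu())
    losses = torch.cat(all_losses).unsqueeze(1).numpy()

    # Fit a two-component mixture; the low-loss component is the suspicious one.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    low_loss_component = gmm.means_.argmin()
    p_suspicious = gmm.predict_proba(losses)[:, low_loss_component]

    # Boolean mask marking the provisional clean pool (threshold is illustrative).
    clean_mask = p_suspicious < 0.5
    return clean_mask
```

In practice, such a split would only be a starting point; the paper combines this loss-based view with a semantic (open set recognition) view to refine which samples are trusted during training.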
Primary Area: Deep Learning->Robustness
Keywords: Deep Learning, Backdoor Defense
Submission Number: 6691