# Bi-perspective Splitting Defense: Achieving Clean-Data-Free Backdoor Security
ICLR2025 Submission 

## Abstract

Backdoor attacks seriously threaten deep neural networks (DNNs) by embedding concealed vulnerabilities through data poisoning. To counteract these attacks, training benign models from poisoned data has garnered considerable interest from researchers. High-performing defenses often rely on additional clean subsets, which is untenable given increasing privacy concerns and data scarcity. In the absence of clean subsets, defenders resort to complex feature extraction and analysis, resulting in excessive overhead and compromised performance. To address these challenges, we identify that the key lies in making sufficient use of the easier-to-obtain target labels and in excavating clean hard samples. In this work, we propose a Bi-perspective Splitting Defense (BSD). BSD splits the dataset using both semantic and loss-statistics characteristics through open set recognition-based splitting (OSS) and altruistic model-based data splitting (ALS), respectively, achieving good clean pool initialization. BSD further introduces class completion and selective dropping strategies in the subsequent pool updates to avoid the potential class underfitting and backdoor overfitting caused by loss-guided splitting. Through extensive experiments on 3 benchmark datasets and against 7 representative attacks, we empirically demonstrate that BSD is robust across various attack settings. Specifically, BSD improves the Defense Effectiveness Rating (DER) by 16.29% on average compared to 5 state-of-the-art defenses, achieving clean-data-free backdoor security with minimal compromise in both Clean Accuracy (CA) and Attack Success Rate (ASR).
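
The abstract reports results in terms of DER without defining it. As a rough reference, the sketch below assumes the definition commonly used in prior backdoor-defense work, DER = [max(0, ΔASR) − max(0, ΔCA) + 1] / 2, where ΔASR and ΔCA are the drops in ASR and CA after applying a defense. The function name and arguments are illustrative and are not taken from this repository; the evaluation code here may compute the metric differently.

```python
def defense_effectiveness_rating(ca_before, asr_before, ca_after, asr_after):
    """Return DER in [0, 1] under the assumed definition.

    All inputs are fractions in [0, 1]:
      ca_before / asr_before: metrics of the undefended (backdoored) model
      ca_after  / asr_after:  metrics of the model trained with the defense
    """
    for value in (ca_before, asr_before, ca_after, asr_after):
        if not 0.0 <= value <= 1.0:
            raise ValueError("metrics must be fractions in [0, 1]")

    # Reward only reductions in ASR; an ASR increase counts as zero gain.
    asr_drop = max(0.0, asr_before - asr_after)
    # Penalize only reductions in CA; a CA increase is not rewarded.
    ca_drop = max(0.0, ca_before - ca_after)

    # Shift and scale so that "no change" maps to 0.5.
    return (asr_drop - ca_drop + 1.0) / 2.0
```

Under this assumed definition, a defense that leaves both metrics unchanged scores 0.5, and higher values indicate a larger ASR reduction at a smaller CA cost.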

<div align="center" style="background-color: white; display: inline-block; padding: 10px;">
    <img src="assets/overview_v240926.png" width="800" alt="Pipeline of BSD"/><br/>
</div>

## Installation

This code has been tested in our local environment (python=3.9, cuda=12). We recommend using Anaconda to create a virtual environment:

```bash
conda create -n BSD python=3.9
```
Then, activate the environment:
```bash
conda activate BSD
```

Install packages:
```bash
pip install -r requirements.txt
```

## Data Preparation

Please download the CIFAR-10 dataset and extract it to the `dataset_dir`
specified in the [YAML configuration file](./config/cifar_badnet.yaml).
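
The download and extraction step can be sketched in Python using only the standard library. This is a minimal sketch, assuming the code reads the standard "python version" archive from the official CIFAR-10 page, which unpacks into a `cifar-10-batches-py` folder. The helper names `download_cifar10` and `missing_cifar10_files` are hypothetical and not part of this repository, and the layout the loader actually expects should be checked against the configuration file.

```python
import tarfile
import urllib.request
from pathlib import Path

# Official "python version" archive of CIFAR-10.
CIFAR10_URL = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"

# Files contained in the extracted `cifar-10-batches-py` folder.
EXPECTED_FILES = (
    ["batches.meta", "test_batch"]
    + [f"data_batch_{i}" for i in range(1, 6)]
)


def missing_cifar10_files(dataset_dir):
    """List the expected CIFAR-10 files that are absent under dataset_dir."""
    root = Path(dataset_dir) / "cifar-10-batches-py"
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def download_cifar10(dataset_dir):
    """Download and extract CIFAR-10 into dataset_dir (hypothetical helper)."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    archive = dataset_dir / "cifar-10-python.tar.gz"
    if not archive.exists():
        urllib.request.urlretrieve(CIFAR10_URL, str(archive))
    # Only extract archives obtained from a source you trust.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dataset_dir)
```

Calling `download_cifar10(path)` with the same path as `dataset_dir`, then checking that `missing_cifar10_files(path)` returns an empty list, gives a quick sanity check before training.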

## Backdoor Defense

Run the following command to train BSD under the BadNets attack:

```shell
python BSD.py --config config/cifar_badnet.yaml --resume False --gpu 0
```
