# Anti-Backdoor Coreset Selection via the Cumulative Entropy


### Abstract
Recent training-time defenses against neural backdoors isolate a benign subset of the poisoned training data and learn a backdoor-free model from it. In this paper, we formulate this defense strategy as a coreset selection problem, giving rise to so-called “Anti-Backdoor Coreset Selection (ABCS).” Since poisoned samples are a) less informative and b) less frequent than benign samples, coreset selection naturally focuses more strongly on the benign functionality than on the backdoor functionality. We introduce a novel selection criterion, the Cumulative Entropy, to further facilitate this effect. The metric tracks the learning dynamics across training samples and chooses samples with high informative value (relative to all clean samples) to be part of the coreset. Additionally, we unlearn the chosen samples in each epoch to avoid a “habituation effect” of informativeness. Together, this yields an exceptionally effective training-time defense that constructs a benign coreset to train a backdoor-free model. Unlike prior defenses that compromise natural accuracy and fail for at least one of the investigated attacks, our method mitigates backdoor attacks consistently and effectively with a negligible impact on natural performance.
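To make the selection idea concrete, here is a minimal sketch of an entropy-based score accumulated over epochs. It is an illustration under assumptions, not the implementation in this repository: the function names `entropy`, `cumulative_entropy`, and `select_coreset` are hypothetical, the score is assumed to be a plain sum of per-epoch predictive entropies, and the per-epoch unlearning step is omitted. The exact definition of the Cumulative Entropy is given in the paper.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cumulative_entropy(history: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Sum each sample's predictive entropy over all recorded epochs.

    history[e][i] is the softmax output for sample i at epoch e.
    ASSUMPTION: a plain sum over epochs; the paper's metric may differ.
    """
    totals = [0.0] * len(history[0])
    for epoch_probs in history:
        for i, probs in enumerate(epoch_probs):
            totals[i] += entropy(probs)
    return totals


def select_coreset(scores: Sequence[float], fraction: float) -> List[int]:
    """Return indices of the top `fraction` of samples, highest score first."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data: 2 epochs, 3 samples, 2 classes.
# Sample 2 is predicted with near certainty from the start (low entropy),
# which is the behavior the abstract attributes to less informative samples.
history = [
    [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]],
    [[0.6, 0.4], [0.95, 0.05], [1.0, 0.0]],
]
scores = cumulative_entropy(history)
coreset = select_coreset(scores, fraction=0.67)
print(coreset)
```

In this toy setting the sample that the model fits almost immediately receives the lowest score and is left out of the coreset, while the two samples that stay uncertain across epochs are kept.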

### Establish Code Environment
```
conda create -n abcs python==3.8
conda activate abcs
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
```
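After installation, a short check can confirm that the expected packages are visible inside the `abcs` environment. The helper below is a hypothetical convenience script and not part of this repository; it only reports versions and whether CUDA is usable, and it degrades gracefully if a package is missing.

```python
import importlib.util
import sys


def check_environment() -> dict:
    """Collect the Python, torch, and torchvision versions plus CUDA status.

    Entries stay None when the corresponding package is not installed.
    """
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "torchvision": None,
        "cuda_available": None,
    }
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    if importlib.util.find_spec("torchvision") is not None:
        import torchvision

        report["torchvision"] = torchvision.__version__
    return report


report = check_environment()
for name, value in report.items():
    print(f"{name}: {value}")
```

With the commands above, the report should show Python 3.8, torch 1.12.0+cu116, and torchvision 0.13.0+cu116. If `cuda_available` is `False`, the installed CUDA 11.6 wheels cannot see a compatible GPU or driver.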

### Coreset Selection via ABCS
```
sh run_abcs_defense.sh
```

### Note
This project is built on [DeepCore](https://arxiv.org/pdf/2204.08499.pdf), a comprehensive library of coreset selection methods for deep learning.