Keywords: Anti-Backdoor Learning, Coreset Selection
TL;DR: We propose Cumulative Entropy as a novel data selection criterion that enables the extraction of a highly informative coreset for training-time backdoor defenses.
Abstract: Recent training-time defenses against neural backdoors isolate a benign subset from poisoned training data in order to learn a backdoor-free model from it. In this paper, we formulate this defense strategy as a coreset selection problem, giving rise to so-called “Anti-Backdoor Coreset Selection.” Since poisonous samples a) have lower prediction uncertainty and b) are less frequent than benign samples, coreset selection naturally focuses more on samples associated with the benign functionality than with the backdoor functionality. We use Cumulative Entropy as the selection criterion to further facilitate this effect. The metric tracks the learning dynamics of training samples and allows us to select benign samples with high informativeness for the coreset. Additionally, we unlearn the chosen samples in each epoch to improve the separability between benign and poisonous samples. Together, this yields an exceptionally effective training-time defense that constructs a benign coreset to train a backdoor-free model. Unlike prior defenses that compromise natural accuracy and fail against certain attacks, our method mitigates backdooring attacks consistently with a negligible impact on natural performance.
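The core idea in the abstract (accumulate per-sample prediction entropy across epochs, then keep the highest-entropy samples as the benign coreset) can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: the class name, the exact accumulation rule, and the selection fraction are all hypothetical.

```python
import numpy as np

def prediction_entropy(probs, eps=1e-12):
    # Shannon entropy of a single softmax prediction vector
    p = np.clip(probs, eps, 1.0)
    return float(-np.sum(p * np.log(p)))

class CumulativeEntropyTracker:
    """Sketch of cumulative-entropy-based coreset selection.

    Accumulates each training sample's prediction entropy over epochs.
    Poisoned samples are typically fit quickly (low uncertainty, low
    entropy), so samples with the HIGHEST cumulative entropy are kept
    as the benign coreset. Details here are illustrative assumptions.
    """

    def __init__(self, n_samples):
        self.cum_entropy = np.zeros(n_samples)

    def update(self, sample_ids, prob_batch):
        # Call once per batch per epoch with the model's softmax outputs.
        for i, probs in zip(sample_ids, prob_batch):
            self.cum_entropy[i] += prediction_entropy(probs)

    def select_coreset(self, fraction):
        # Keep the top `fraction` of samples by accumulated entropy.
        k = max(1, int(fraction * len(self.cum_entropy)))
        return np.argsort(self.cum_entropy)[-k:]
```

In a training loop, `update` would be fed the model's predictions each epoch, and `select_coreset` would periodically yield the indices used for the next round of (benign) training, with the unlearning step from the abstract applied to the selected samples.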
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 8922