BagFlip: A Certified Defense Against Data Poisoning

Yuhao Zhang; Aws Albarghouthi; Loris D'Antoni

BagFlip: A Certified Defense Against Data Poisoning

Yuhao Zhang, Aws Albarghouthi, Loris D'Antoni

Published: 31 Oct 2022, Last Modified: 06 Apr 2025NeurIPS 2022 AcceptReaders: Everyone

Keywords: data poisoning, certified defense, backdoor attack, trigger-less attack

TL;DR: We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attack.

Abstract: Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backdoor attacks or do not provide effective certificates (i.e., a proof of a defense). We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. We evaluate BagFlip on image classification and malware detection datasets. BagFlip is equal to or more effective than the state-of-the-art approaches for trigger-less attacks and more effective than the state-of-the-art approaches for backdoor attacks.

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/bagflip-a-certified-defense-against-data/code)

12 Replies

Loading