Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

Jimmy Z. Di; Jack Douglas; Jayadev Acharya; Gautam Kamath; Ayush Sekhari

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

Jimmy Z. Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, Ayush Sekhari

Published: 21 Nov 2022, Last Modified: 26 May 2025TSRML2022Readers: Everyone

Keywords: Machine Unlearning, Poisoning Attack, Camouflaging Poisons

TL;DR: We show that machine unlearning can be used to implement a new type of camouflaged data poisoning attack.

Abstract: We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced. An adversary first adds a few carefully crafted points to the training dataset such that the impact on the model's predictions is minimal. The adversary subsequently triggers a request to remove a subset of the introduced points at which point the attack is unleashed and the model's predictions are negatively affected. In particular, we consider clean-label targeted attacks (in which the goal is to cause the model to misclassify a specific test point) on datasets including CIFAR-10, Imagenette, and Imagewoof. This attack is realized by constructing camouflage datapoints that mask the effect of a poisoned dataset.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/hidden-poison-machine-unlearning-enables/code)

3 Replies

Loading