MetaPoison: Learning to craft adversarial poisoning examples via meta-learning

Sep 25, 2019 · ICLR 2020 Conference Withdrawn Submission
  • TL;DR: We generate imperceptibly perturbed training images that alter a CNN's behavior on a chosen target image, even when the network is retrained from scratch.
  • Abstract: We consider a new class of \emph{data poisoning} attacks on neural networks, in which the attacker takes control of a model by making small perturbations to a subset of its training data. We formulate the task of finding poisons as a bi-level optimization problem, which can be solved using methods borrowed from the meta-learning community. Unlike previous poisoning strategies, meta-poisoning can compromise networks that are trained from scratch, from an initialization unknown to the attacker, and it transfers across training hyperparameters. Further, we show that our attacks are more versatile: they can cause the target image to be misclassified as an arbitrarily chosen class. Our results show an attack success rate above 50% when poisoning just 3-10% of the training data.
  • Code: https://github.com/2350532677/metapoison
  • Keywords: Adversarial Examples, Poisoning, Backdoor Attacks, Deep Learning
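The bi-level formulation sketched in the abstract can be illustrated on a toy problem: an inner training step updates the model on the (poisoned) data, and an outer "meta" step adjusts the poison perturbations so that the resulting model misclassifies a chosen target. The snippet below is a minimal sketch under strong simplifying assumptions, not the paper's method: it uses a linear logistic-regression model, a single unrolled inner SGD step at one fixed initialization, and finite differences in place of backpropagating through training; all names (`inner_step`, `outer_loss`, `poison_idx`, `y_adv`) are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def inner_step(w, X, y, lr=0.5):
    # One unrolled SGD step of logistic-regression training on (possibly poisoned) data.
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / len(y)
    return w - lr * grad

def outer_loss(delta, w0, X, y, poison_idx, x_target, y_adv):
    # Meta-objective: after the inner training step on poisoned data,
    # push the target's prediction toward the attacker-chosen label y_adv.
    Xp = X.copy()
    Xp[poison_idx] += delta
    w1 = inner_step(w0, Xp, y)
    p = sigmoid(x_target @ w1)
    return -(y_adv * np.log(p + 1e-9) + (1 - y_adv) * np.log(1 - p + 1e-9))

# Toy data: two features, label = sign of the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] > 0).astype(float)
x_target = np.array([1.5, 0.0])   # true class 1
y_adv = 0.0                       # attacker wants it classified as 0
poison_idx = np.arange(4)         # perturb 10% of the training set
delta = np.zeros((4, 2))
w0 = np.zeros(2)                  # fixed initialization for this sketch
eps = 1e-5

loss_before = outer_loss(delta, w0, X, y, poison_idx, x_target, y_adv)
for _ in range(200):
    # Finite-difference meta-gradient of the outer loss w.r.t. the poison.
    base = outer_loss(delta, w0, X, y, poison_idx, x_target, y_adv)
    g = np.zeros_like(delta)
    for i in range(delta.shape[0]):
        for j in range(delta.shape[1]):
            d2 = delta.copy()
            d2[i, j] += eps
            g[i, j] = (outer_loss(d2, w0, X, y, poison_idx, x_target, y_adv) - base) / eps
    # Gradient step on the poison, clipped to keep perturbations small.
    delta = np.clip(delta - 0.1 * g, -0.5, 0.5)
loss_after = outer_loss(delta, w0, X, y, poison_idx, x_target, y_adv)
```

The full attack additionally averages the meta-gradient over many initializations and training stages (which is what makes the poisons work for networks trained from scratch); here only the direction of the effect is shown, with the adversarial loss on the target decreasing as the poison is optimized.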