Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial ExamplesDownload PDF

Published: 12 Jan 2021, Last Modified: 05 May 2023ICLR 2021 PosterReaders: Everyone
Keywords: hard-label attack, black-box attack, adversarial attack, reinforcement learning
Abstract: To craft black-box adversarial examples, adversaries need to query the victim model and take proper advantage of its feedback. Existing black-box attacks generally suffer from high query complexity, especially when only the top-1 decision (i.e., the hard-label prediction) of the victim model is available. In this paper, we propose a novel hard-label black-box attack named Policy-Driven Attack, to reduce the query complexity. Our core idea is to learn promising search directions of the adversarial examples using a well-designed policy network in a novel reinforcement learning formulation, in which the queries become more sensible. Experimental results demonstrate that our method can significantly reduce the query complexity in comparison with existing state-of-the-art hard-label black-box attacks on various image classification benchmark datasets. Code and models for reproducing our results are available at https://github.com/ZiangYan/pda.pytorch
One-sentence Summary: A novel hard-label black-box adversarial attack that introduces a reinforcement learning based formulation with a pre-trained policy network
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10)
12 Replies