CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks

Shashank Agnihotri; Steffen Jung; Margret Keuper

17 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Primary Area: general machine learning (i.e., none of the above)

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: adversarial attacks, pgd, fgsm, cospgd, cosine similarity, semantic segmentation, optical flow, benchmarking tool, benchmark adversarial attack, lp norm, l-inf norm, l-2 norm

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We introduce a new and efficient adversarial attack for pixel-wise prediction tasks that leverages the cosine similarity between the predictions and ground truth/target posteriors to extend directly from classification tasks to regression settings.

Abstract: While neural networks allow highly accurate predictions in many tasks, their lack of robustness towards even slight input perturbations hampers their deployment in many real-world applications. White-box adversarial attacks such as the seminal projected gradient descent (PGD) offer an effective means to evaluate model robustness, and dedicated solutions have been proposed, for example, for attacks on semantic segmentation or on optical flow. To streamline the evaluation process, we propose an efficient white-box adversarial attack, termed CosPGD, that can be applied to any pixel-wise prediction task in a unified setting. To this end, CosPGD employs a simple loss scaling based on the cosine similarity between the distributions over the predictions and the ground truth (or target, for targeted attacks). This leads to efficient evaluations of a model's robustness for pixel-wise classification as well as regression models, providing new insights into their performance at earlier attack stages. We outperform the state of the art (SotA) on semantic segmentation attacks in our experiments on PASCAL VOC2012 and CityScapes. Further, we showcase CosPGD's versatility by evaluating optical flow as well as image restoration models. We provide code for the CosPGD algorithm and example usage at https://anonymous.4open.science/r/cospgd-iclr2024-909/.
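The loss scaling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a segmentation-style setting in which each pixel has class logits and an integer label, that the "distribution over the predictions" is a softmax, that the ground truth is one-hot encoded, and that each per-pixel cross-entropy term is multiplied directly by the cosine similarity. The function names `softmax`, `cosine_similarity`, and `cosine_scaled_loss` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def cosine_scaled_loss(pixel_logits, pixel_labels):
    """Mean per-pixel cross-entropy, each term weighted by the cosine
    similarity between the softmax prediction and the one-hot label.

    Assumption: direct multiplication by the similarity; the exact
    scaling used by CosPGD is defined in the paper, not here.
    """
    num_classes = len(pixel_logits[0])
    total = 0.0
    for logits, label in zip(pixel_logits, pixel_labels):
        probs = softmax(logits)
        one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
        weight = cosine_similarity(probs, one_hot)
        cross_entropy = -math.log(max(probs[label], 1e-12))
        total += weight * cross_entropy
    return total / len(pixel_labels)


# Two pixels, two classes: one undecided pixel, one confidently correct pixel.
example_loss = cosine_scaled_loss([[0.0, 0.0], [10.0, 0.0]], [0, 0])
```

In a PGD-style attack, a loss of this form would replace the unweighted loss when computing the gradient with respect to the input perturbation. Under the assumptions above, pixels whose prediction still aligns with the label receive a weight near 1, while pixels that are already confidently misclassified receive a weight near 0.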

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 909
