Abstract: Despite its importance, generating attacks for multi-label learning (MLL) models has received much less attention than multi-class recognition. Attacking an MLL model by optimizing a loss on the target set of labels often has the undesired consequence of changing the predictions
for other labels. On the other hand, adding a loss on the remaining labels to keep them fixed leads to highly negatively correlated gradient directions, which reduces the attack's effectiveness. In this paper, we develop a framework for crafting
effective and semantic-aware adversarial attacks for MLL.
First, to obtain an attack that leads to semantically consistent predictions across all labels, we find a minimal superset of the target labels, referred to as the consistent target set.
To do so, we develop an efficient search algorithm over a knowledge graph that encodes label dependencies (a simplified closure computation is sketched after the abstract). Next,
we propose an optimization that searches for an attack that modifies the predictions of the labels in the consistent target set while ensuring that the remaining labels are not affected. This leads to an efficient algorithm that projects the gradient of the consistent target set loss onto the direction orthogonal to the gradient of the loss on the remaining labels (also sketched below). Our framework can generate attacks for target sets of different sizes and for MLL models with thousands of labels (as in OpenImages). Finally, through extensive experiments on three datasets and several MLL models, we show that our method generates attacks that are both successful and semantically consistent.
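The abstract does not spell out the search algorithm over the knowledge graph. As a rough illustration only, if the graph is assumed to store implication edges between labels (e.g., dog implies animal), a minimal consistent superset of the targets is their closure under those edges. The function name `consistent_target_set` and the `implies` adjacency format are assumptions for this sketch, not the paper's actual interface:

```python
from collections import deque

def consistent_target_set(targets, implies):
    """Return the minimal superset of `targets` closed under implication
    edges: if u is in the set and u implies v, then v is too.
    `implies[u]` is an iterable of labels entailed by label u (assumed
    adjacency-list encoding of the knowledge graph)."""
    closed = set(targets)
    queue = deque(targets)
    while queue:
        u = queue.popleft()
        for v in implies.get(u, ()):
            if v not in closed:
                closed.add(v)
                queue.append(v)
    return closed

# Toy usage: attacking "dog" also requires predicting "animal".
print(consistent_target_set({"dog"}, {"dog": ["animal"], "cat": ["animal"]}))
# -> {'dog', 'animal'}
```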
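The gradient-projection step, by contrast, follows directly from the description above: remove from the target-set gradient its component along the gradient of the loss on the remaining labels. Below is a minimal PyTorch-style sketch; `loss_target`, `loss_others`, and the single-step interface are illustrative assumptions rather than the authors' code:

```python
import torch

def projected_attack_step(x, loss_target, loss_others, step_size=1e-2):
    """One ascent step whose direction is the gradient of the consistent
    target set loss, projected onto the subspace orthogonal to the gradient
    of the loss on the remaining labels (illustrative sketch)."""
    x = x.clone().detach().requires_grad_(True)

    g_t = torch.autograd.grad(loss_target(x), x)[0]  # target-set gradient
    g_o = torch.autograd.grad(loss_others(x), x)[0]  # remaining-labels gradient

    # Subtract the component of g_t along g_o; to first order, moving along
    # g_proj changes the target-set loss without changing the other loss.
    g_o_flat = g_o.flatten()
    coef = torch.dot(g_t.flatten(), g_o_flat) / (g_o_flat.dot(g_o_flat) + 1e-12)
    g_proj = g_t - coef * g_o

    return (x + step_size * g_proj).detach()
```

In a full attack this step would be iterated under a perturbation budget (e.g., clipped to an l-infinity ball, PGD-style); the sketch isolates only the orthogonal projection of the two gradients.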