Sparse and Transferable Universal Singular Vectors Attack

Published: 20 Sept 2024, Last Modified: 29 Sept 2024 · ICOMP Publication · CC BY 4.0
Keywords: adversarial attacks, computer vision
Abstract: Mounting concerns about the safety and robustness of neural networks call for a deeper understanding of models' vulnerability and for research in adversarial attacks. Motivated by this, we propose a novel attack with high transferability. In contrast to the existing $(p, q)$-singular vectors approach, we focus on finding sparse singular vectors of Jacobian matrices of the hidden layers by employing the truncated power iteration method. We find that using the resulting vectors as adversarial perturbations can effectively attack both the original model and models with entirely different architectures, highlighting the importance of the sparsity constraint for attack transferability. Moreover, we achieve results comparable to dense baselines while damaging less than 1% of pixels and using only 256 samples to fit the perturbation. Our algorithm also admits a higher attack magnitude without affecting the human ability to solve the task: damaging 5% of pixels attains more than a 50% fooling rate on average across models. Finally, our findings demonstrate the vulnerability of state-of-the-art models to universal sparse attacks and highlight the importance of developing robust machine learning systems.
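For intuition, below is a minimal sketch of truncated power iteration for extracting a sparse dominant right singular vector of a matrix $J$ (e.g. a hidden-layer Jacobian). This is an illustrative reconstruction under standard assumptions, not the authors' released code; the sparsity level `k`, the iteration count, and the toy matrix sizes are hypothetical choices.

```python
# Sketch: truncated power iteration on J^T J with hard thresholding,
# yielding a unit vector with at most k nonzero entries that
# approximates the dominant sparse right singular vector of J.
import numpy as np

def truncated_power_iteration(J, k, n_iter=50, seed=0):
    """Return a unit vector with at most k nonzeros approximating
    the dominant right singular vector of J (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = J.shape[1]
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        v = J.T @ (J @ v)                 # power step on J^T J
        idx = np.argsort(np.abs(v))[:-k]  # indices of all but the k largest
        v[idx] = 0.0                      # hard-threshold to enforce sparsity
        norm = np.linalg.norm(v)
        if norm == 0.0:
            break
        v /= norm                         # re-normalize to unit length
    return v

# Toy usage: a random 128x784 "Jacobian"; k=7 nonzeros is <1% of 784 inputs.
J = np.random.default_rng(1).standard_normal((128, 784))
v = truncated_power_iteration(J, k=7)
print(np.count_nonzero(v), np.linalg.norm(J @ v))
```

In the paper's setting, the sparse vector found this way would be scaled to the desired attack magnitude and applied as a universal perturbation across inputs; the sketch only shows the inner sparse singular-vector computation.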
Submission Number: 13
