Fast adversarial attacks to deep neural networks through gradual sparsification

Published: 01 Jan 2024, Last Modified: 16 May 2025, Venue: Eng. Appl. Artif. Intell. 2024, License: CC BY-SA 4.0
Abstract: Deep learning networks, emerging machine learning models that can exceed human-level accuracy, are critically vulnerable to adversarial attacks. This vulnerability limits the use of deep learning architectures in many real-world safety-critical applications, such as autonomous vehicles, medical diagnosis, and sensitive industrial systems. Adversarial attacks provide a way to measure the robustness of different architectures and can be used to evaluate whether a model is suitable for a safety-critical setting. White-box sparse adversarial attacks can reveal interesting features of deep learning networks by identifying critical elements in the input pattern, which can in turn inform the design of black-box attacks. This motivated us to develop a new algorithmic procedure, based on sparsity regularization, for designing sparse adversarial attacks on feed-forward neural networks. The proposed method uses gradual sparsification: it starts by designing a dense attack and prunes it until a desired level of sparsity is attained. We evaluate the proposed algorithm by designing attacks on convolutional neural networks and attention-based architectures for image classification, using three non-smooth sparsity-promoting regularizers. Compared to state-of-the-art sparse attack schemes, we show that the proposed method can significantly decrease the time needed to design an attack, while the perturbation distortion is unchanged or, in some cases, reduced.
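The "dense first, then prune" idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the update rule or name the three regularizers, so the sketch assumes a toy linear classifier, a hinge-style margin loss, and plain magnitude pruning as a stand-in for the sparsity-promoting regularizers. The names `margin`, `refine`, `gradual_sparse_attack`, and the parameters `kappa` and `lr` are all hypothetical.

```python
import numpy as np

def margin(w, b, x, y):
    # Positive when the linear model sign(w.x + b) still predicts label y in {-1, +1}.
    return y * (w @ x + b)

def refine(w, b, x, y, delta, mask, kappa=0.1, lr=0.05, max_steps=10_000):
    # Gradient descent on max(0, margin + kappa), restricted to the coordinates
    # where mask == 1. Stops once the input is misclassified with margin kappa.
    for _ in range(max_steps):
        if margin(w, b, x + delta, y) <= -kappa:
            break
        delta = delta - lr * y * w * mask
    return delta

def gradual_sparse_attack(w, b, x, y, k, kappa=0.1, lr=0.05):
    # Assumed schedule: build a dense attack, then repeatedly zero out the
    # smallest-magnitude active coordinate and re-optimize on the remaining support.
    d = x.size
    mask = np.ones(d)
    delta = refine(w, b, x, y, np.zeros(d), mask, kappa, lr)  # dense attack
    while int(mask.sum()) > k:
        active = np.flatnonzero(mask)
        drop = active[np.argmin(np.abs(delta[active]))]
        mask[drop] = 0.0
        delta[drop] = 0.0
        delta = refine(w, b, x, y, delta, mask, kappa, lr)
    return delta

# Toy example (all values are illustrative).
w = np.array([2.0, -1.0, 0.5, 3.0, -0.2, 0.1])
b = 0.0
x = np.ones(6)
y = 1
delta = gradual_sparse_attack(w, b, x, y, k=2)
print("nonzero entries:", np.count_nonzero(delta))
print("margin after attack:", margin(w, b, x + delta, y))
```

In this toy setting the perturbation ends up concentrated on the coordinates with the largest weights, which loosely mirrors the abstract's point that sparse attacks identify critical input elements. A real attack on a convolutional or attention-based network would replace the closed-form gradient `y * w` with a gradient obtained by backpropagation.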