Ignorance is not Bliss: A Novel Ensemble Method to Counter Adversarial Attacks on Deep Learning Models

Shubhajit Datta, Manaar Alam, Arijit Mondal, Debdeep Mukhopadhyay, Partha P Chakrabarti

Published: 18 Dec 2024, Last Modified: 17 Oct 2025, Crossref, CC BY-SA 4.0
Abstract: Adversarial perturbations prevent deep learning models from being deployed in high-security systems. Protecting these models from adversarial attacks is challenging and is drawing significant attention from researchers. One existing technique for defending deep neural networks against such attacks is to use an ensemble of classifiers. However, the transferability of adversarial attacks hampers the effectiveness of ensemble methods. Hence, recent studies focus on developing diversity among the members of an ensemble to deal with transferable adversarial attacks. In this paper, we study another aspect of diversity, one that leads the members of an ensemble to learn diverse input features. Prior work has shown that a deep learning network does not learn all the significant features of its inputs. To address this issue, we propose a novel method, PIIP (Preventing Ignorance of Important Pixels), which helps a member of an ensemble learn the input features that the other members ignore. This prevents important pixels of an input image from being ignored and also develops diversity among the members of the ensemble. As a result, the proposed method increases the robustness of deep learning models in the presence of adversarial attacks. We test our method on the CIFAR-10 dataset as well as two real-world datasets, PlantVillage and chest X-ray, under state-of-the-art adversarial attack techniques. PIIP significantly increases classification accuracy under different attack techniques without affecting classification accuracy on clean examples. We also compare PIIP with existing ensemble methods and observe that it outperforms them in classifying adversarial examples. In particular, under a BIM (Basic Iterative Method) attack, the performance of PIIP improves by at least \(5\%\) compared to existing ensemble methods.
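For readers unfamiliar with the BIM attack named in the abstract, the following is a minimal sketch of its update rule: repeatedly step the input along the sign of the loss gradient, then clip back into an epsilon-ball around the original input and into the valid pixel range. It is not the authors' experimental setup. The toy logistic-regression model, the helper names (`predict_proba`, `loss_grad_wrt_input`, `bim_attack`), and the values of `eps`, `alpha`, and `steps` are all assumptions chosen only for illustration; the paper attacks deep networks.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic model (assumed, not from the paper)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def bim_attack(w, b, x, y, eps=0.1, alpha=0.02, steps=10):
    """Basic Iterative Method: iterated signed-gradient steps with clipping.

    eps, alpha and steps are illustrative values, not those used in the paper.
    """
    x_adv = list(x)
    for _ in range(steps):
        g = loss_grad_wrt_input(w, b, x_adv, y)
        # Ascend the loss by a small step in the direction of the gradient sign.
        x_adv = [xi + alpha * sign(gi) for xi, gi in zip(x_adv, g)]
        # Stay within eps of the original input and within the pixel range [0, 1].
        x_adv = [
            min(max(xa, xo - eps, 0.0), xo + eps, 1.0)
            for xa, xo in zip(x_adv, x)
        ]
    return x_adv


if __name__ == "__main__":
    w, b = [2.0, -3.0, 1.5], -0.2   # hypothetical model parameters
    x, y = [0.6, 0.2, 0.5], 1       # hypothetical input and true label
    x_adv = bim_attack(w, b, x, y)
    print("clean p(y=1):", predict_proba(w, b, x))
    print("adversarial p(y=1):", predict_proba(w, b, x_adv))
```

Run as a script, this prints the model's confidence in the true class before and after the attack; the perturbation never exceeds `eps` per coordinate, which is what makes the iterated attack comparable across defenses.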