Keywords: AI Security, Backdoor or Trojan Attacks on Deep Networks, Safe and Robust AI
Abstract: The success of a deep neural network (DNN) heavily relies on the details of the
training scheme; e.g., training data, architectures, hyper-parameters, etc. Recent
backdoor attacks suggest that an adversary can take advantage of such training
details and compromise the integrity of the DNN. Our studies show that a backdoor
model is usually optimized to a bad local minimum, i.e., a sharper minimum compared
to a benign model. Intuitively, the backdoor can be purified by re-optimizing the
model to a smoother minimum through fine-tuning on a small set of clean validation data.
However, fine-tuning all DNN parameters often incurs high computational costs and
yields sub-par clean test performance. To address this concern, we propose a novel
backdoor purification technique, Natural Gradient Fine-tuning (NGF), which
focuses on removing the backdoor by fine-tuning only one layer. Specifically, NGF
utilizes a loss surface geometry-aware optimizer that can successfully overcome
the challenge of reaching a smooth minimum under the one-layer optimization scenario.
To enhance the generalization performance of our proposed method, we introduce
a clean data distribution-aware regularizer based on the loss-surface curvature
matrix, i.e., the Fisher Information Matrix. To validate the effectiveness of
our method, we conduct extensive experiments with four different datasets (CIFAR10,
GTSRB, Tiny-ImageNet, and ImageNet) and 11 recent backdoor attacks (e.g., Blend,
Dynamic, and Clean Label). NGF achieves state-of-the-art
performance in most of these benchmarks.
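As a minimal illustration of the natural-gradient idea described in the abstract (not the authors' implementation), the sketch below fine-tunes a single layer using a diagonal empirical-Fisher approximation of the Fisher Information Matrix, together with a Fisher-weighted penalty that keeps the layer close to its initial weights. The diagonal approximation, the EWC-style penalty, and all names and hyper-parameters are assumptions made for illustration only.

```python
# Sketch: one-layer fine-tuning with a natural-gradient-style update on clean data.
# Assumptions (not from the paper): diagonal empirical Fisher, Fisher-weighted
# pull toward the initial weights as a stand-in for the clean distribution-aware
# regularizer, and illustrative names (`model`, `clean_loader`, `last_layer`).
import torch
import torch.nn.functional as F

def natural_gradient_finetune(model, clean_loader, last_layer,
                              lr=1e-2, reg_lambda=1.0, damping=1e-3,
                              epochs=5, device="cpu"):
    model.to(device).train()
    # Freeze all parameters except those of the chosen layer (e.g., the classifier head).
    for p in model.parameters():
        p.requires_grad_(False)
    params = [p.requires_grad_(True) for p in last_layer.parameters()]
    init = [p.detach().clone() for p in params]

    for _ in range(epochs):
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            loss = F.cross_entropy(model(x), y)
            grads = torch.autograd.grad(loss, params)

            with torch.no_grad():
                # Diagonal empirical Fisher estimate from this batch: E[g^2] + damping.
                fisher = [g.pow(2) + damping for g in grads]
                for p, g, f, p0 in zip(params, grads, fisher, init):
                    # Fisher-weighted penalty keeping the layer near its starting point.
                    reg_grad = reg_lambda * f * (p - p0)
                    # Natural-gradient-style step: precondition by the inverse Fisher diagonal.
                    p -= lr * (g + reg_grad) / f
    return model
```

In this sketch, dividing the gradient by the Fisher diagonal plays the role of the loss-geometry-aware preconditioning, while the Fisher-weighted pull toward the initial weights stands in for the clean data distribution-aware regularizer; it would be called, for example, as `natural_gradient_finetune(model, clean_loader, model.fc)` for a ResNet-style classifier.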
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning