Efficient Backdoor Mitigation in Federated Learning with Contrastive Loss

24 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Backdoor Defense; Federated Learning; Contrastive Loss
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Due to the data-driven nature of deep neural networks and privacy concerns around user data, a backdoor can easily be injected into deep neural networks in federated learning without attracting the attention of users. An affected global model behaves normally, like a clean model, on regular tasks but behaves differently when the trigger is present. In this paper, we propose a novel reverse engineering approach to detect and mitigate backdoor attacks in federated learning by adopting a self-supervised contrastive learning loss. In contrast to existing reverse engineering techniques, such as Neural Cleanse, which iterate through each class in the dataset, we employ the contrastive loss as a whole to identify triggers in the backdoored model. Our method compares the last-layer feature outputs of a potentially affected model with those from a clean one preserved beforehand to reconstruct the trigger under the guidance of the contrastive loss. The reverse-engineered trigger is then applied to patch the affected global model and remove the backdoor. If the global model is free from backdoors, the contrastive loss leads to either a blank trigger or one with a random pattern. We evaluated the proposed method on three datasets under two backdoor attacks and compared it against three existing defense methods. Our results show that while many popular reverse engineering algorithms, including Neural Cleanse, TABOR, and DeepInspect, are successful in centralized learning settings, they have difficulty detecting backdoors in federated learning. Our method successfully detected backdoors in federated learning and was more time-efficient.
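The abstract only outlines the method at a high level; the sketch below is one plausible reading of it, assuming the trigger is recovered by optimizing a mask and pattern so that a contrastive-style objective separates the suspect model's last-layer features from those of a preserved clean reference model. All names (reverse_engineer_trigger, suspect_model, clean_model) and the exact loss form are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch of contrastive trigger reverse-engineering (PyTorch).
# Assumption: both models map an image batch to last-layer feature vectors.
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(suspect_model, clean_model, data_loader,
                             image_shape=(3, 32, 32), steps=500, lr=0.1,
                             mask_weight=1e-3, device="cpu"):
    # Learnable trigger mask (where to stamp) and pattern (what to stamp).
    mask = torch.zeros(1, *image_shape[1:], device=device, requires_grad=True)
    pattern = torch.zeros(image_shape, device=device, requires_grad=True)
    optimizer = torch.optim.Adam([mask, pattern], lr=lr)

    suspect_model.eval()
    clean_model.eval()

    for _ in range(steps):
        for images, _ in data_loader:
            images = images.to(device)
            m = torch.sigmoid(mask)              # keep mask values in [0, 1]
            p = torch.tanh(pattern)              # bounded trigger pattern
            patched = (1 - m) * images + m * p   # stamp candidate trigger

            feats_suspect = F.normalize(suspect_model(patched), dim=1)
            with torch.no_grad():
                feats_clean = F.normalize(clean_model(patched), dim=1)

            # Contrastive-style objective: push the suspect model's features
            # away from the clean reference on patched inputs, plus an L1
            # penalty that keeps the recovered trigger small and localized.
            similarity = (feats_suspect * feats_clean).sum(dim=1).mean()
            loss = similarity + mask_weight * m.abs().sum()

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    return torch.sigmoid(mask).detach(), torch.tanh(pattern).detach()
```

Under these assumptions, a recovered mask that stays near zero (a blank or random trigger, as the abstract describes for clean models) would indicate no backdoor, whereas a compact, high-magnitude mask would flag the global model for patching.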
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8785