Abstract: Adversarial patches pose a significant threat to deep neural networks (DNNs). Unlike conventional adversarial attacks, which are digital and less effective in the physical world, adversarial patches can disrupt DNNs in real-world scenarios with potentially catastrophic outcomes. Understanding the characteristics of these patches is therefore crucial to comprehending this form of adversarial attack. While prior research has focused primarily on increasing the success rate of adversarial patches against specific DNN models, the rise of federated learning (FL) introduces a new attack vector: an attacker can manipulate a DNN model's learnable parameters by contributing model updates trained on poisoned data. To assess the feasibility and danger of adversarial patches in this setting, we introduce a novel attack method named PoisonPatch. PoisonPatch combines FL poisoning with adversarial patch search: it first poisons a DNN-based image classifier through FL and then employs an adversarial patch search algorithm to generate patches that raise the attack success rate. This dual approach yields attacks that are harder for machines to detect than traditional poisoning attacks, and patches that, owing to their natural appearance, are less noticeable to the human eye than typical adversarial patches. Our extensive experimental results demonstrate that PoisonPatch surpasses current state-of-the-art methods, producing natural-looking patches while achieving a 100% attack success rate.
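To make the two-stage idea described above concrete, the following is a minimal, hypothetical sketch in PyTorch of how an FL poisoning round followed by a gradient-based patch search could be wired together. All names and design choices here (apply_patch, local_update, fedavg, search_patch, the poisoned-label strategy, and the simple FedAvg aggregation) are illustrative assumptions, not the actual PoisonPatch implementation.

```python
# Hypothetical sketch: (1) a malicious FL client submits an update trained on
# patched, relabeled data; (2) after aggregation, a patch-search loop optimizes
# the patch against the resulting global model. Not the paper's algorithm.
import copy
import torch
import torch.nn.functional as F

def apply_patch(images, patch, x=0, y=0):
    """Paste a square patch onto a batch of images at position (x, y)."""
    patched = images.clone()
    p = patch.size(-1)
    patched[:, :, y:y + p, x:x + p] = patch
    return patched

def local_update(model, loader, poison=None, target=None, epochs=1, lr=0.01):
    """One client's local training; a malicious client patches and relabels its data."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for imgs, labels in loader:
            if poison is not None:
                imgs = apply_patch(imgs, poison)
                labels = torch.full_like(labels, target)  # poisoned target label
            opt.zero_grad()
            F.cross_entropy(model(imgs), labels).backward()
            opt.step()
    return model.state_dict()

def fedavg(global_model, client_states):
    """Plain FedAvg: average client state dicts into the global model."""
    avg = copy.deepcopy(client_states[0])
    for k in avg:
        avg[k] = sum(s[k] for s in client_states) / len(client_states)
    global_model.load_state_dict(avg)
    return global_model

def search_patch(model, images, target, size=8, steps=200, lr=0.05):
    """Gradient-based patch search: push patched inputs toward the target class."""
    patch = torch.rand(1, images.size(1), size, size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    labels = torch.full((images.size(0),), target)
    for _ in range(steps):
        out = model(apply_patch(images, patch.clamp(0, 1)))
        loss = F.cross_entropy(out, labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return patch.detach().clamp(0, 1)
```

In this sketch, the poisoning stage biases the global model toward the target label whenever the patch region is present, so the subsequent patch search can reach a high success rate with a visually unobtrusive patch; how PoisonPatch actually couples the two stages and enforces natural appearance is detailed in the paper itself.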