Abstract: Backdoor attacks pose significant security risks to deep neural networks (DNNs). In these attacks, a model makes intentionally incorrect (and potentially targeted) predictions on poisoned inputs containing carefully crafted triggers, while behaving normally on clean inputs. Prior studies have investigated the invisibility of backdoor triggers to improve attack stealthiness. However, they primarily focus on achieving invisibility in the spatial domain alone, ignoring the generation of invisible triggers in the frequency and feature domains. This limitation leaves poisoned images vulnerable to detection by recent defense mechanisms. To tackle this problem, we introduce a Triple stealthy BAckdoor attack approach, termed TriBA, which simultaneously ensures the invisibility of triggers across the spatial, frequency, and feature domains, achieving strong attack performance while maintaining strong stealthiness. Specifically, we first use the Wavelet Transform to embed the high-frequency information of the trigger image into the clean image, ensuring effective attack performance. Then, to achieve strong stealthiness in both the spatial and frequency domains, we combine the Fourier Transform and the Cosine Transform to blend the poisoned image and the clean image in the frequency domain. Furthermore, TriBA adopts an attack strategy that makes backdoor features similar to clean features in the feature space, which guarantees trigger invisibility in the feature domain while maintaining attack effectiveness. We theoretically prove the effectiveness of this strategy. Finally, TriBA has been comprehensively evaluated on four datasets against popular image classifiers, demonstrating a marked improvement over existing state-of-the-art backdoor attacks in terms of both attack success rate and stealthiness.
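The first step described above, embedding the trigger's high-frequency content into a clean image via a wavelet decomposition, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single-level 2D Haar wavelet, grayscale images with even dimensions, and a hypothetical blending weight `alpha` controlling how much of the trigger's detail subbands is mixed in; the low-frequency (LL) band of the clean image is left untouched, which is what keeps the poisoned image visually close to the original.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar DWT: returns (LL, LH, HL, HH) subbands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0  # vertical averages
    d = (img[0::2, :] - img[1::2, :]) / 2.0  # vertical details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2 (exact reconstruction)."""
    h, w = LL.shape
    a = np.empty((h, 2 * w)); d = np.empty((h, 2 * w))
    a[:, 0::2] = LL + LH; a[:, 1::2] = LL - LH
    d[:, 0::2] = HL + HH; d[:, 1::2] = HL - HH
    out = np.empty((2 * h, 2 * w))
    out[0::2, :] = a + d
    out[1::2, :] = a - d
    return out

def embed_trigger(clean, trigger, alpha=0.1):
    """Blend the trigger's high-frequency subbands into the clean image.

    `alpha` is a hypothetical mixing weight (not from the paper):
    the clean LL band is kept so low-frequency appearance is preserved.
    """
    LLc, LHc, HLc, HHc = haar_dwt2(clean)
    _, LHt, HLt, HHt = haar_dwt2(trigger)
    return haar_idwt2(LLc,
                      (1 - alpha) * LHc + alpha * LHt,
                      (1 - alpha) * HLc + alpha * HLt,
                      (1 - alpha) * HHc + alpha * HHt)
```

Because only the detail subbands are perturbed, the low-frequency content of the poisoned image is identical to that of the clean image, which is one intuition behind spatial-domain stealthiness; the paper's additional Fourier/Cosine-domain blending and feature-space alignment are not captured by this sketch.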
External IDs: dblp:journals/tdsc/GaoCSLYWL25