FeatShield: Isolating Malicious Feature Extractors for Backdoor-Robust Federated Learning

Published: 26 Oct 2025, Last Modified: 28 Jan 2026 · ACM MM 2025 · CC BY 4.0
Abstract: Federated learning remains vulnerable to backdoor attacks mounted through malicious parameter updates, and existing defenses are limited by homogeneous-data assumptions or by reliance on gradient anomaly detection. We reveal that FedAvg's critical flaw lies in the propagation of malicious feature extractors: aggregating poisoned extractors degrades defense accuracy to below 70% across five benchmarks, whereas benign extractors paired with poisoned headers retain 89.36% defense accuracy on average. We therefore propose FeatShield, a feature-space isolation framework that prevents backdoor propagation via non-aggregated local extractors trained on clean client data. FeatShield introduces 1) variance-aware alignment, which adaptively balances client-specific features against global consistency using local variance metrics, and 2) adversarial feature synthesis, which uses a GAN to generate non-linear synthetic features that improve the global prediction header's generalization on the main task. Extensive experiments on eight real-world datasets show that FeatShield achieves the best defense performance. For instance, under heterogeneous data (Dirichlet $\beta$=0.5) and strong attacks (50% malicious clients), FeatShield attains 99.26-99.89% defense accuracy and main-task accuracy exceeding FedAvg by 1.32-5.70%, demonstrating superior resistance to backdoor attacks without sacrificing benign performance.
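The core isolation idea can be illustrated with a minimal sketch, assuming a PyTorch setup. The class names, dimensions, and training loop below are hypothetical, and the variance-aware alignment and GAN-based feature synthesis described in the abstract are omitted; the sketch only shows the split in which each client's feature extractor stays local while the prediction header alone is averaged FedAvg-style.

```python
# Minimal sketch of feature-space isolation (assumed PyTorch setup; names and
# dimensions are hypothetical, not the paper's reference implementation).
# Only the prediction header is uploaded and averaged; feature extractors
# never leave their clients, so a poisoned extractor cannot propagate.

import copy
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):          # stays local, never aggregated
    def __init__(self, in_dim=784, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, feat_dim), nn.ReLU())
    def forward(self, x):
        return self.net(x)

class PredictionHeader(nn.Module):          # the only part sent to the server
    def __init__(self, feat_dim=128, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_classes)
    def forward(self, z):
        return self.fc(z)

def local_update(extractor, header, loader, epochs=1, lr=0.01):
    """Train extractor + header on the client's local data; upload header only."""
    params = list(extractor.parameters()) + list(header.parameters())
    opt = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(header(extractor(x)), y)
            loss.backward()
            opt.step()
    return header.state_dict()              # extractor weights stay on the client

def aggregate_headers(header_states):
    """FedAvg over header parameters only; extractors are never averaged."""
    avg = copy.deepcopy(header_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in header_states]).mean(0)
    return avg
```

Because only the low-capacity header is shared, a backdoored extractor remains confined to the malicious client, which is consistent with the abstract's observation that benign extractors paired with poisoned headers still retain high defense accuracy.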