TriGuardFL: Triple-Step Byzantine-Robust Federated Learning against Model Poisoning Attacks

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Federated Learning; Byzantine-Robust
Abstract: Federated learning (FL) offers a promising distributed architecture, yet it is vulnerable to model poisoning attacks that degrade global model accuracy. Existing defense strategies typically compare clients' locally updated gradients and exclude or down-weight those exhibiting substantial deviations. However, these strategies may become ineffective when clients' datasets are heterogeneous. In this paper, we propose TriGuardFL, a novel triple-step defense framework that robustly discriminates malicious actors from benign but non-IID clients. First, we employ a cosine-similarity-based filter to identify suspicious clients. Second, a fine-grained secondary evaluation assesses their performance on a small class-stratified dataset; by analyzing class-wise performance differences, it discerns whether a divergent update stems from a malicious attack or from data heterogeneity. Finally, a Bayesian reputation model is integrated to manage detection uncertainty and enhance long-term robustness. Extensive case studies on two benchmark datasets and three representative model poisoning attacks demonstrate that TriGuardFL outperforms existing methods in mitigating their impact.
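The triple-step pipeline described in the abstract can be pictured with a minimal NumPy sketch. This is not the paper's implementation: the similarity threshold, the class-wise decision rule in `classwise_check`, the Beta-distribution reputation update, and the `eval_fn` evaluation hook are all illustrative assumptions standing in for criteria the abstract does not specify.

```python
# Minimal sketch of a TriGuardFL-style round: cosine filter -> class-wise
# re-check -> reputation-weighted aggregation. All thresholds and decision
# rules below are illustrative assumptions, not the paper's actual criteria.
import numpy as np

def cosine_filter(updates, threshold=0.0):
    """Step 1: flag clients whose update direction deviates from the mean.

    updates: (n_clients, dim) array of flattened model updates.
    Returns indices of suspicious clients (cosine similarity below threshold).
    """
    mean_dir = updates.mean(axis=0)
    mean_dir = mean_dir / (np.linalg.norm(mean_dir) + 1e-12)
    norms = np.linalg.norm(updates, axis=1) + 1e-12
    sims = (updates @ mean_dir) / norms
    return np.where(sims < threshold)[0]

def classwise_check(class_acc, benign_gap=0.4):
    """Step 2 (assumed rule): a poisoned model tends to collapse accuracy
    across *all* classes, while a benign-but-non-IID client keeps reasonable
    accuracy on the classes it actually holds data for.

    class_acc: per-class accuracy on a small class-stratified reference set.
    Returns True if the client is judged malicious.
    """
    uniform_collapse = class_acc.max() - class_acc.min() < benign_gap
    return uniform_collapse and class_acc.mean() < 0.5

class BetaReputation:
    """Step 3: Beta(alpha, beta) reputation per client; the posterior mean
    alpha / (alpha + beta) serves as the aggregation weight."""
    def __init__(self, n_clients):
        self.alpha = np.ones(n_clients)  # pseudo-counts of benign behaviour
        self.beta = np.ones(n_clients)   # pseudo-counts of malicious behaviour

    def update(self, client, malicious):
        if malicious:
            self.beta[client] += 1.0
        else:
            self.alpha[client] += 1.0

    def weights(self):
        return self.alpha / (self.alpha + self.beta)

def triguard_aggregate(updates, eval_fn, rep):
    """One round: filter, re-check suspects, then reputation-weighted mean.

    eval_fn(i) is a hypothetical hook returning per-class accuracy of
    client i's model on the class-stratified reference set.
    """
    for i in cosine_filter(updates):
        rep.update(i, malicious=classwise_check(eval_fn(i)))
    w = rep.weights()
    return (w[:, None] * updates).sum(axis=0) / w.sum()

# Toy usage: one crude sign-flip attacker among ten clients, with a random
# stand-in for the real class-stratified evaluation.
rng = np.random.default_rng(0)
updates = rng.normal(size=(10, 100))
updates[0] *= -5.0
rep = BetaReputation(n_clients=10)
fake_eval = lambda i: rng.uniform(0.0, 0.3, size=10)
aggregated = triguard_aggregate(updates, fake_eval, rep)
```

Starting every client at Beta(1, 1) means a single flagged round only lowers a client's weight rather than excluding it outright, which matches the abstract's emphasis on managing detection uncertainty over the long term.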
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 6374