Abstract: Federated reinforcement learning (FRL) enhances sample efficiency while preserving data privacy. However, standard FRL frameworks rely on aggregating model parameters or gradients, making them vulnerable to Byzantine attacks. Current Byzantine-resilient approaches focus primarily on server-side robust aggregation, leaving the fundamental vulnerability of transmitting parameters unaddressed. In this paper, we revisit Byzantine resilience in FRL from the knowledge distillation (KD) perspective. KD-based FRL uploads policy representations instead of policy parameters. This framework-level shift fundamentally constrains the attack surface. We theoretically prove that traditional FRL suffers unbounded corruption from Byzantine agents, whereas KD-based FRL converges to an $\mathcal{O}(\alpha)$-stationary point under an $\alpha$-fraction of adversaries, formalizing the accuracy-robustness trade-off. Empirical validation confirms the Byzantine resilience of KD-based FRL: it maintains near-optimal performance across diverse attacks and withstands Byzantine fractions up to 0.9. Our theoretical guarantees and experiments demonstrate that distillation endows FRL with fundamentally stronger resilience.
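To illustrate the framework-level contrast the abstract describes, the following is a minimal numerical sketch (not the paper's method; all agent counts, shapes, and the one-hot attack are illustrative assumptions). It contrasts parameter averaging, where a single Byzantine agent can inject an arbitrarily large vector and corrupt the aggregate without bound, against aggregating KD-style policy outputs (action distributions on shared states), where each upload lies on the probability simplex and a Byzantine fraction $\alpha$ can shift the aggregate by at most $\mathcal{O}(\alpha)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (illustrative, not from the paper):
n_agents, n_byz = 10, 1
dim = 4  # policy parameter dimension

# --- Parameter-averaging FRL: one Byzantine agent uploads an
# arbitrarily large vector, so the mean is unboundedly corrupted.
honest_params = rng.normal(size=(n_agents - n_byz, dim))
byz_params = np.full((n_byz, dim), 1e6)  # unbounded attack vector
param_agg = np.vstack([honest_params, byz_params]).mean(axis=0)

# --- KD-based FRL: agents upload action distributions (policy
# outputs) on shared states; every row is a probability vector,
# so even a worst-case one-hot upload has bounded influence.
n_actions = 3
honest_probs = rng.dirichlet(np.ones(n_actions), size=n_agents - n_byz)
byz_probs = np.eye(n_actions)[[0]]  # worst case: adversarial one-hot
kd_agg = np.vstack([honest_probs, byz_probs]).mean(axis=0)

alpha = n_byz / n_agents
honest_mean = honest_probs.mean(axis=0)
# Deviation of the KD aggregate from the honest mean is at most
# alpha * (max entry-wise shift) <= alpha, since entries lie in [0, 1].
kd_dev = np.abs(kd_agg - honest_mean).max()

print(f"param aggregate norm: {np.linalg.norm(param_agg):.1f}")
print(f"KD deviation from honest mean: {kd_dev:.3f} (alpha = {alpha})")
```

The bounded-simplex geometry is what makes the corruption scale with $\alpha$ rather than with the attacker's chosen magnitude, mirroring the $\mathcal{O}(\alpha)$-stationarity guarantee stated in the abstract.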
External IDs: dblp:conf/trustcom/JiangWZLZF25