Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum

TMLR Paper4363 Authors

26 Feb 2025 (modified: 29 May 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: Federated Learning (FL) has emerged as the state-of-the-art approach for learning from decentralized data in privacy-constrained scenarios. However, system and statistical challenges hinder its real-world applicability, which demands efficient learning on edge devices and robustness to data heterogeneity. Despite significant research efforts, existing approaches often degrade severely under the joint effect of heterogeneity and partial client participation. In particular, while momentum appears to be a promising approach for overcoming statistical heterogeneity, in current approaches its update is biased towards the most recently sampled clients. As we show in this work, this bias is the reason momentum fails to outperform FedAvg, preventing its effective use in real-world large-scale scenarios. We propose a novel _Generalized Heavy-Ball Momentum_ (GHBM) and theoretically prove that it enables convergence under unbounded data heterogeneity in _cyclic partial participation_, thereby advancing the understanding of momentum's effectiveness in FL. We then introduce adaptive and communication-efficient variants of GHBM that match the communication complexity of FedAvg in settings where clients can be _stateful_. Extensive experiments on vision and language tasks confirm our theoretical findings, demonstrating that GHBM substantially improves state-of-the-art performance under random uniform client sampling, particularly in large-scale settings with high data heterogeneity and low client participation.
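To make the idea concrete, below is a minimal Python sketch of a server-side generalized heavy-ball update. The update rule x_{t+1} = x_t + Δ_t + β(x_t − x_{t−τ}) and the function `ghbm_server` are our illustrative assumptions, not the paper's exact formulation: τ = 1 recovers classical heavy-ball momentum, whose lookback covers only the most recent round, while a larger τ lets the momentum term span the last τ rounds, so under cyclic participation with period τ it aggregates contributions from every client.

```python
from collections import deque

import numpy as np


def ghbm_server(x0, client_deltas_per_round, beta=0.9, tau=10):
    """Server-side generalized heavy-ball update (illustrative sketch).

    Assumed rule (not necessarily the paper's exact formulation):
        x_{t+1} = x_t + avg_delta_t + beta * (x_t - x_{t-tau})
    """
    past = deque([np.copy(x0)], maxlen=tau + 1)  # holds x_{t-tau}, ..., x_t
    x = np.copy(x0)
    for deltas in client_deltas_per_round:
        avg_delta = np.mean(deltas, axis=0)  # FedAvg-style aggregation of client updates
        x_lag = past[0]                      # x_{t-tau} (clamped to x_0 in early rounds)
        x = x + avg_delta + beta * (x - x_lag)
        past.append(np.copy(x))
    return x


# Toy usage: 4-dimensional model, 3 clients per round, 5 rounds.
rng = np.random.default_rng(0)
rounds = [rng.normal(scale=0.01, size=(3, 4)) for _ in range(5)]
x_final = ghbm_server(np.zeros(4), rounds, beta=0.9, tau=2)
```

Note that the lookback term x_t − x_{t−τ} telescopes into the sum of the last τ rounds' aggregated updates, which is the intuition behind debiasing momentum away from the most recently sampled clients.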
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Tian_Li1
Submission Number: 4363