Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: federated learning, computer vision, machine learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In recent years, Federated Learning (FL) has emerged as the state-of-the-art approach for learning from decentralized data, thanks to its privacy-preserving and communication-efficient characteristics. As the current literature reports, the main problems associated with FL are system and statistical challenges: the former demand efficient learning from edge devices, including reduced communication bandwidth and frequency, while the latter require algorithms robust to non-i.i.d. data. A principled way to address the statistical challenge relies on additional control variables that correct the local clients' updates, but the resulting convergence guarantees come at the cost of doubled communication. This motivates the need for a communication-efficient FL algorithm that robustly handles data heterogeneity. In this work we generalize heavy-ball momentum to the FL scenario, effectively addressing statistical heterogeneity without introducing any communication overhead. We conduct extensive experiments on common FL vision and NLP datasets, showing that our FedHBM algorithm empirically yields better model quality and higher convergence speed w.r.t. the state of the art, especially in pathological non-i.i.d. scenarios. Experiments in controlled small-scale scenarios are extended to large-scale real-world federated datasets, further corroborating the effectiveness of our approach for real-world FL applications. We additionally show how, while being designed for cross-silo settings, FedHBM remains applicable in cross-device scenarios with moderate-to-high client participation, and how good model initializations (e.g., from pre-training) can be exploited for prompt acceleration.
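For readers unfamiliar with the classical heavy-ball update that the abstract says the paper generalizes, the following is a minimal, hypothetical Python sketch of heavy-ball momentum applied to server-side FL aggregation (in the spirit of FedAvg with server momentum). All names and parameters here are illustrative assumptions; this is not the paper's FedHBM algorithm, whose generalized momentum is designed to add no communication overhead.

import numpy as np

def heavy_ball_server_round(theta, theta_prev, client_deltas, lr=1.0, beta=0.9):
    """One illustrative server round with classical heavy-ball momentum.

    theta:         current global model parameters (np.ndarray)
    theta_prev:    global model from the previous round
    client_deltas: list of per-client updates (local_model - global_model)
    lr, beta:      server learning rate and momentum coefficient (assumed values)
    """
    # The average of client updates acts as a pseudo-gradient for the server.
    avg_delta = np.mean(client_deltas, axis=0)
    # Heavy-ball update: pseudo-gradient step plus a momentum term
    # beta * (theta - theta_prev) built from consecutive global models.
    theta_next = theta + lr * avg_delta + beta * (theta - theta_prev)
    return theta_next, theta  # new model, and current model as next round's "prev"

The sketch highlights why momentum is attractive here: the momentum term is computed from model states the participants already hold, so, unlike control-variate corrections, it need not double the per-round communication.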
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5505