Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: federated learning, momentum, data heterogeneity, non-convex optimization
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We develop momentum variants of FedAvg and SCAFFOLD that attain state-of-the-art (or better) convergence rates under weaker assumptions.
Abstract: Federated learning is a powerful paradigm for large-scale machine learning, but it faces significant challenges due to unreliable network connections, slow communication, and substantial data heterogeneity across clients. FedAvg and SCAFFOLD are two prominent algorithms that address these challenges. In particular, FedAvg performs multiple local updates before communicating with a central server, while SCAFFOLD maintains a control variate on each client to compensate for "client drift" in its local updates. Various methods have been proposed to enhance the convergence of these two algorithms, but they either make impractical adjustments to the algorithmic structure or rely on the assumption of bounded data heterogeneity. This paper explores the use of momentum to enhance the performance of FedAvg and SCAFFOLD. When all clients participate in the training process, we demonstrate that incorporating momentum allows FedAvg to converge without the bounded-data-heterogeneity assumption, even with a constant local learning rate. This is novel and fairly surprising, as existing analyses of FedAvg require bounded data heterogeneity even with diminishing local learning rates. Under partial client participation, we show that momentum enables SCAFFOLD to converge provably faster without imposing any additional assumptions. Furthermore, we use momentum to develop new variance-reduced extensions of FedAvg and SCAFFOLD, which exhibit state-of-the-art convergence rates. Our experimental results support all theoretical findings.
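The abstract describes server-side momentum applied to FedAvg's aggregated local updates. As a minimal sketch of that idea (not the paper's exact algorithm; the toy quadratic objectives, step sizes, and helper names here are all illustrative assumptions), each client runs a few local SGD steps from the global model, the server averages the resulting model changes as a pseudo-gradient, and a momentum buffer smooths the server update:

```python
import numpy as np

# Illustrative sketch of FedAvg with server-side momentum on toy quadratic
# client objectives 0.5 * ||A_i w - b_i||^2. Hyperparameters and function
# names are hypothetical, chosen only to demonstrate the mechanism.

def local_sgd(w, A, b, lr=0.1, steps=5):
    """Run a few local gradient steps on one client's objective."""
    for _ in range(steps):
        w = w - lr * A.T @ (A @ w - b)
    return w

def fedavg_momentum(clients, rounds=200, beta=0.9, server_lr=1.0):
    """FedAvg where the server applies momentum to the averaged update."""
    d = clients[0][0].shape[1]
    w = np.zeros(d)
    m = np.zeros(d)  # server momentum buffer
    for _ in range(rounds):
        # each client starts from the current global model
        deltas = [w - local_sgd(w, A, b) for A, b in clients]
        pseudo_grad = np.mean(deltas, axis=0)  # averaged model change
        m = beta * m + pseudo_grad             # momentum accumulation
        w = w - server_lr * m                  # server update
    return w
```

Even with heterogeneous clients (different `A_i`, `b_i`) and a constant local learning rate, this server-momentum variant drives the global objective down on the toy problem, which is the qualitative behavior the abstract's full-participation result formalizes.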
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Primary Area: optimization
Submission Number: 1089