Abstract: Federated learning algorithms perform multiple local updates on clients before communicating with the parameter server to reduce communication overhead and improve overall training efficiency.
However, local updates also lead to the “client-drift” problem under
non-IID data, which prevents convergence to the exact optimal solution under heterogeneous
data distributions. To ensure accurate convergence, existing federated-learning algorithms
employ auxiliary variables to locally estimate the global gradient or the drift from the global
gradient, which, however, also incurs extra communication and storage overhead. In this
paper, we propose a new recursion-based federated-learning architecture that completely
eliminates the need for auxiliary variables while ensuring accurate convergence under
heterogeneous data distributions. This new federated-learning architecture, called FedRecu,
significantly reduces communication and storage overhead compared with existing
federated-learning algorithms that guarantee accurate convergence. More importantly, this
novel architecture enables FedRecu to employ much larger stepsizes than existing federated-learning
algorithms, thereby leading to much faster convergence. We provide rigorous convergence
analysis of FedRecu under both convex and nonconvex loss functions, for both deterministic
and stochastic gradients. Notably, our theoretical analysis shows that FedRecu ensures
o(1/K) convergence to an accurate solution under general convex loss functions, improving
upon the O(1/K) convergence rate currently achievable for such functions. Numerical
experiments on benchmark datasets confirm the effectiveness of the proposed algorithm.
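To make the "client-drift" problem concrete, the following is a minimal sketch (our illustration, not the paper's FedRecu algorithm) of FedAvg-style local updates on two clients with heterogeneous quadratic losses f_i(x) = (a_i/2)(x - c_i)^2. The names `fedavg`, `a`, `c`, and `eta` are hypothetical; with more than one local step per round, the averaged iterate settles at a biased fixed point away from the true global optimum:

```python
# Two heterogeneous clients: f_i(x) = a_i/2 * (x - c_i)^2 (illustrative values).
a = [1.0, 3.0]          # per-client curvatures (non-IID data)
c = [0.0, 1.0]          # per-client minimizers
x_star = sum(ai * ci for ai, ci in zip(a, c)) / sum(a)  # global optimum = 0.75

def fedavg(rounds, local_steps, eta, x=0.0):
    """Plain FedAvg: each client runs E local gradient steps, server averages."""
    for _ in range(rounds):
        client_iterates = []
        for ai, ci in zip(a, c):
            xi = x
            for _ in range(local_steps):     # E local gradient-descent steps
                xi -= eta * ai * (xi - ci)   # gradient of f_i is a_i * (x - c_i)
            client_iterates.append(xi)
        x = sum(client_iterates) / len(client_iterates)  # server average
    return x

x_one = fedavg(rounds=2000, local_steps=1, eta=0.1)    # E = 1: reaches x_star
x_many = fedavg(rounds=2000, local_steps=10, eta=0.1)  # E = 10: drifts from x_star
```

With a single local step the scheme converges to the exact optimum 0.75, while ten local steps per round converge to a biased point (about 0.60 here); this bias is what auxiliary-variable methods, and FedRecu without auxiliary variables, are designed to remove.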
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yi_Zhou2
Submission Number: 6528