Abstract: In federated learning, local updates allow clients to perform multiple training steps before communicating
with the parameter server to reduce communication overhead and improve overall
training efficiency. However, local updates also lead to the “client-drift” problem under
non-IID data, which prevents convergence to the exact optimal solution under heterogeneous
data distributions. To ensure accurate convergence, existing federated-learning algorithms
employ auxiliary variables to locally estimate the global gradient or the drift from the global
gradient, which, however, also incurs extra communication and storage overhead. In this
paper, we propose a new recursion-based federated-learning architecture that completely
eliminates the need for auxiliary variables while ensuring accurate convergence under heterogeneous
data distributions. This new federated-learning architecture, called FedRecu, can
significantly reduce communication and storage overhead compared with existing federated-learning
algorithms with accurate convergence guarantees. More importantly, this novel architecture
enables FedRecu to employ much larger stepsizes than existing federated-learning
algorithms, thereby leading to much faster convergence. We provide rigorous convergence
analysis of FedRecu under both convex and nonconvex loss functions, in both the deterministic
gradient case and the stochastic gradient case. In fact, our theoretical analysis shows
that FedRecu ensures o(1/K) convergence to an accurate solution under general convex loss
functions, which improves upon the existing achievable O(1/K) convergence rate for general
convex loss functions, and which, to our knowledge, has not been reported in the literature
except for some restricted convex cases with additional constraints. Numerical experiments
on benchmark datasets confirm the effectiveness of the proposed algorithm.
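As a concrete illustration of the client-drift phenomenon and of the auxiliary-variable corrections discussed above, the toy sketch below (not the FedRecu recursion, which is defined in the paper body) runs plain local-update averaging on hypothetical heterogeneous quadratic losses; the curvatures, targets, stepsize, and local step count are illustrative assumptions, and the optional control-variate arguments mimic a SCAFFOLD-style drift correction.

```python
import numpy as np

# Per-client losses f_i(x) = 0.5 * a_i * ||x - t_i||^2 with heterogeneous
# curvatures a_i and optima t_i (a simple non-IID surrogate; values are made up).
curvs   = [1.0, 4.0, 0.25]
targets = [np.array([1.0, 0.0]), np.array([0.0, 2.0]), np.array([-1.0, 1.0])]
grad_fns = [lambda x, a=a, t=t: a * (x - t) for a, t in zip(curvs, targets)]

def client_update(x_global, grad_fn, steps=10, lr=0.1,
                  c_global=None, c_local=None):
    """Run several local gradient steps; optionally apply a SCAFFOLD-style
    drift correction using auxiliary (control-variate) variables."""
    x = x_global.copy()
    for _ in range(steps):
        g = grad_fn(x)
        if c_global is not None:
            g = g - c_local + c_global   # auxiliary-variable drift correction
        x -= lr * g
    return x

def fedavg_round(x_global, grad_fns, steps=10, lr=0.1):
    """One communication round: local updates on every client, then plain averaging."""
    return np.mean([client_update(x_global, g, steps, lr) for g in grad_fns],
                   axis=0)

x = np.zeros(2)
for _ in range(200):
    x = fedavg_round(x, grad_fns)

x_star = sum(a * t for a, t in zip(curvs, targets)) / sum(curvs)  # true global optimum
print("FedAvg-style fixed point:", x)      # biased away from x_star by client drift
print("global optimum          :", x_star)
```

Running the sketch shows the plain local-update scheme settling at a point different from the global optimum; passing control variates through c_global and c_local is one way existing methods remove this bias, at the cost of the extra communication and storage that the abstract refers to.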
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yi_Zhou2
Submission Number: 6528