Delayed Local-SGD for Distributed Learning with Linear Speedup

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Distributed Learning, Federated Learning, Distributed Optimization, Linear Speedup
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Local-SGD-based algorithms have gained popularity in distributed learning because they reduce communication overhead: each client performs multiple local iterations before communicating with the central server. However, since every participating client must start its local iterations from the latest global model in each round of Local-SGD, the overall training process can be slowed down by the straggler effect. To address this issue, we propose a Delayed Local-SGD (DLSGD) framework for distributed and federated learning with partial client participation. In DLSGD, each client performs local training starting from an outdated model, regardless of whether it participates in the global aggregation. We investigate two types of DLSGD methods, applied to scenarios where clients have identical or different local objective functions. Theoretical analyses demonstrate that DLSGD achieves asymptotic convergence rates on par with those of classical Local-SGD methods for solving nonconvex problems, and guarantees linear speedup with respect to the number of participating clients. Additionally, we carry out numerical experiments on real datasets to validate the efficiency and scalability of our approach when training neural networks.
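The sketch below illustrates the general idea described in the abstract, not the paper's exact update rules: each client runs local SGD from a possibly stale copy of the global model, and the server averages the (possibly delayed) results of a randomly sampled subset of clients each round. The toy least-squares objectives, step sizes, and sampling scheme are illustrative assumptions.

```python
import numpy as np

# Hypothetical DLSGD-style loop (details assumed, not taken from the paper):
# clients run local SGD from stale model copies; the server averages updates
# from a random subset of clients in every round (partial participation).

rng = np.random.default_rng(0)
dim, num_clients, participate = 10, 8, 3   # model size, total clients, sampled per round
local_steps, lr, rounds = 5, 0.05, 50

# Toy heterogeneous least-squares objectives f_i(x) = 0.5 * ||A_i x - b_i||^2 / n_i
A = [rng.standard_normal((20, dim)) for _ in range(num_clients)]
b = [rng.standard_normal(20) for _ in range(num_clients)]

def local_sgd(x, i):
    """Run a few gradient steps on client i's objective, starting from model x."""
    for _ in range(local_steps):
        grad = A[i].T @ (A[i] @ x - b[i]) / len(b[i])
        x = x - lr * grad
    return x

x_global = np.zeros(dim)
# Each client works from the last global model it received (possibly stale).
stale_model = [x_global.copy() for _ in range(num_clients)]
pending = [local_sgd(stale_model[i], i) for i in range(num_clients)]

for t in range(rounds):
    # Server samples a subset of clients and aggregates their delayed updates.
    sampled = rng.choice(num_clients, size=participate, replace=False)
    x_global = np.mean([pending[i] for i in sampled], axis=0)
    # Sampled clients refresh their local copy; non-participants keep training
    # from whatever (outdated) model they currently hold.
    for i in range(num_clients):
        if i in sampled:
            stale_model[i] = x_global.copy()
        pending[i] = local_sgd(stale_model[i], i)

print("final global model norm:", np.linalg.norm(x_global))
```

In this simulation the server never waits for slow clients to restart from the latest model, which is the mechanism the abstract credits for mitigating the straggler effect.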
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6781