Communication-Efficient Local SGD for Over-Parametrized Models with Partial Participation

Published: 01 Jan 2023, Last Modified: 21 Feb 2024, CDC 2023
Abstract: We analyze the convergence rate of Local Stochastic Gradient Descent (SGD) for over-parameterized models, which is at the core of federated learning. In this setting, the server randomly selects a subset of agents and communicates with them at each communication round to optimize a global objective function. This captures realistic scenarios in which the communication link between the server and an agent may break down due to random link failures or adversarial attacks. We establish convergence guarantees for smooth objective functions without the convexity assumption, which is the first such result for this regime. We also consider an extension of our results to a different random participation setting over general network structures (rather than a star network), in which an agent participates in the local optimization steps of its neighbors with an edge-dependent probability. We characterize the convergence rate of the proposed algorithm in terms of the number of communication rounds, which confirms the communication efficiency of our methods, and we validate our results through a numerical experiment.
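
To make the partial-participation setup concrete, below is a minimal, self-contained sketch (not the authors' implementation) of Local SGD in which each agent joins a communication round independently with probability p, runs a few local SGD steps from the current global model, and the server averages the returned models. The toy over-parameterized least-squares problem, agent count, step size, and participation probability are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem: each agent holds an over-parameterized least-squares task
# (more parameters than local samples), all consistent with one ground-truth w_star.
n_agents, dim, n_samples = 10, 50, 20
w_star = rng.standard_normal(dim)
data = []
for _ in range(n_agents):
    A = rng.standard_normal((n_samples, dim))
    data.append((A, A @ w_star))                 # (features, targets) per agent

def local_sgd_step(w, A, y, lr):
    """One stochastic gradient step on a single randomly drawn local sample."""
    i = rng.integers(len(y))
    a, t = A[i], y[i]
    return w - lr * (a @ w - t) * a

def local_sgd_partial_participation(rounds=200, local_steps=5, lr=0.01, p=0.5):
    """Server loop: sample participating agents, run local SGD, average models."""
    w_global = np.zeros(dim)
    for _ in range(rounds):
        # Each agent participates this round independently with probability p.
        active = [k for k in range(n_agents) if rng.random() < p]
        if not active:
            continue                              # no agents reachable this round
        updates = []
        for k in active:
            A, y = data[k]
            w = w_global.copy()
            for _ in range(local_steps):          # local optimization steps
                w = local_sgd_step(w, A, y, lr)
            updates.append(w)
        # Server averages the models returned by the participating agents.
        w_global = np.mean(updates, axis=0)
    return w_global

w_hat = local_sgd_partial_participation()
loss = np.mean([np.mean((A @ w_hat - y) ** 2) for A, y in data])
print(f"average training loss: {loss:.3e}")
```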