On the Role of Data Homogeneity in Multi-Agent Non-convex Stochastic Optimization

Qiang LI, Hoi To Wai

Published: 15 Jul 2022, Last Modified: 07 Oct 2024 | OpenReview Archive Direct Upload | Everyone | CC BY 4.0

Abstract: This paper studies the role of data homogeneity in multi-agent optimization. Concentrating on the decentralized stochastic gradient (DSGD) algorithm, we characterize the transient time, defined as the minimum number of iterations required for DSGD to achieve performance comparable to that of its centralized counterpart. When the Hessians of the objective functions are identical across agents, we show that the transient time of DSGD is ${\cal O}(n^{4/3}/\rho^{8/3})$ for smooth (possibly non-convex) objective functions, where $n$ is the number of agents and $\rho$ is the spectral gap of the connectivity graph. This improves upon the bound of ${\cal O}(n^2/\rho^4)$ that holds without the Hessian homogeneity assumption. Our analysis leverages the property that the objective function is twice continuously differentiable. Numerical experiments are presented to illustrate the importance of data homogeneity for the fast convergence of DSGD.
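For readers unfamiliar with DSGD, the following is a minimal sketch of the commonly used form of the recursion, in which each agent averages its neighbors' iterates through a mixing matrix $W$ and then takes a local stochastic gradient step: $x_i^{t+1} = \sum_j W_{ij} x_j^t - \gamma \, g_i(x_i^t)$. It is an illustration under stated assumptions, not the authors' implementation. The ring topology, the scalar quadratic objectives, the constant step size, and all function names (`ring_mixing_matrix`, `make_quadratic_grad`, `dsgd`) are hypothetical choices made here for brevity.

```python
import random


def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents.

    Assumption: each agent weights itself and its two ring neighbors by 1/3.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] = 1.0 / 3.0
    return W


def make_quadratic_grad(a, b, sigma, rng):
    """Noisy gradient of the toy objective f_i(x) = 0.5 * a * (x - b)**2.

    Sharing the same curvature `a` across agents gives identical Hessians,
    a simple instance of the homogeneity condition in the abstract.
    """
    def grad(x):
        return a * (x - b) + sigma * rng.gauss(0.0, 1.0)
    return grad


def dsgd(grad_fns, W, x0, step_size, num_iters):
    """Run DSGD on scalar iterates: mix with neighbors, then step locally."""
    n = len(grad_fns)
    x = list(x0)
    for _ in range(num_iters):
        # Consensus step: weighted average of the neighbors' current iterates.
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Local step: stochastic gradient evaluated at the agent's own iterate.
        x = [mixed[i] - step_size * grad_fns[i](x[i]) for i in range(n)]
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    n = 5
    W = ring_mixing_matrix(n)
    # Agents share curvature a = 1 but have different minimizers b_i = i.
    grads = [make_quadratic_grad(1.0, float(i), 0.1, rng) for i in range(n)]
    x = dsgd(grads, W, [0.0] * n, step_size=0.05, num_iters=2000)
    print("agent iterates:", x)
    print("network average:", sum(x) / n)
```

In this toy setup the minimizer of the average objective is the mean of the $b_i$, so the network average of the iterates should hover near it, while the individual agents retain a small step-size-dependent disagreement. The sketch does not attempt to reproduce the transient-time bounds stated in the abstract.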