Abstract: We develop an asynchronous gradient method for training machine learning models with distributed workers, each with its own communication and computation pace and its own local data distribution. In modern distributed training, the local data distributions across workers are often heterogeneous (a.k.a. client bias), which is a significant limiting factor in the analysis of most existing distributed asynchronous optimization methods. In this work, we propose AsyncBC, a distributed asynchronous variant of the SARAH method, and show that it serves as an effective Bias Correction mechanism for distributed asynchronous optimization. We show that AsyncBC can handle arbitrary data heterogeneity, as well as gradient updates that arrive in an uncoordinated manner and with delays. As a byproduct of our analysis, we also provide a deeper understanding of how different stochasticity models affect the convergence of the SARAH method.
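For context, a minimal sketch of the recursive gradient estimator used by the (single-machine) SARAH method that AsyncBC builds upon; the notation here ($w_t$ for iterates, $v_t$ for the estimator, $i_t$ for a sampled data index, $\eta$ for the step size) is generic and not taken from the submission:
$$
v_t = \nabla f_{i_t}(w_t) - \nabla f_{i_t}(w_{t-1}) + v_{t-1}, \qquad w_{t+1} = w_t - \eta\, v_t,
$$
with $v_0 = \nabla f(w_0)$ computed as a full gradient at the start of each outer loop. The recursive correction term is what allows the estimator to track the true gradient despite stochastic (and, in the asynchronous setting studied here, delayed and heterogeneous) updates.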
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yingbin_Liang1
Submission Number: 4872