Low Sample and Communication Complexities in Decentralized Learning: A Triple Hybrid Approach

2021 (modified: 29 Jan 2023), INFOCOM 2021
Abstract: Network-consensus-based decentralized learning optimization algorithms have attracted significant attention in recent years due to their rapidly growing applications. However, most existing decentralized learning algorithms cannot achieve low sample complexity and low communication complexity simultaneously, two important metrics in evaluating the trade-off between the computation and communication costs of decentralized learning. To overcome this limitation, we propose a triple hybrid decentralized stochastic gradient descent (TH-DSGD) algorithm for efficiently solving non-convex network-consensus optimization problems for decentralized learning. We show that to reach an ϵ²-stationary solution, the total sample complexity of TH-DSGD is O(ϵ⁻³) and the communication complexity is O(ϵ⁻³), both of which are independent of dataset sizes and significantly improve on the sample and communication complexities of existing works. We conduct extensive experiments with a variety of learning models to verify our theoretical findings. We also show that our TH-DSGD algorithm remains stable as the network topology becomes sparse and enjoys better convergence in the large-system regime.
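For readers unfamiliar with the setting, the following is a minimal sketch of plain network-consensus decentralized SGD, the baseline family that the abstract refers to. It is not TH-DSGD: the abstract does not state the TH-DSGD update rules, so none are reproduced here. Everything in the sketch is an illustrative assumption, including the ring topology, the uniform 1/3 mixing weights, the scalar quadratic local objectives, and the hypothetical names `ring_mixing_matrix` and `decentralized_sgd`.

```python
import random


def ring_mixing_matrix(m):
    """Doubly stochastic mixing matrix for a ring of m nodes (assumed topology).

    Each node puts weight 1/3 on itself and on each of its two ring neighbors.
    """
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in (i - 1, i, i + 1):
            W[i][j % m] += 1.0 / 3.0
    return W


def decentralized_sgd(targets, step_size=0.01, num_iters=2000,
                      noise_std=0.0, seed=0):
    """Generic consensus-based decentralized SGD (illustrative, not TH-DSGD).

    Node i holds the local objective f_i(x) = 0.5 * (x - targets[i])**2, so the
    minimizer of the average objective is the mean of `targets`.
    """
    rng = random.Random(seed)
    m = len(targets)
    W = ring_mixing_matrix(m)
    x = [0.0] * m  # one scalar model copy per node
    for _ in range(num_iters):
        # Communication: each node averages its copy with its neighbors' copies.
        mixed = [sum(W[i][j] * x[j] for j in range(m)) for i in range(m)]
        # Computation: each node draws one noisy local gradient at its own copy.
        grads = [(x[i] - targets[i]) + rng.gauss(0.0, noise_std)
                 for i in range(m)]
        x = [mixed[i] - step_size * grads[i] for i in range(m)]
    return x


if __name__ == "__main__":
    local_models = decentralized_sgd([0.0, 1.0, 2.0, 3.0, 4.0])
    print("node models:", local_models)
    print("network average:", sum(local_models) / len(local_models))
```

In this toy setup every iteration costs one communication round and one gradient sample per node, which are the two quantities the sample and communication complexity bounds count. With a constant step size the node copies agree only approximately, while their average approaches the mean of the targets.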