Fast and Communication Efficient Decentralized Learning with Local Updates
Keywords: Optimization, Distributed learning, Decentralized learning
Abstract: Gossip and random-walk-based learning are two widely studied approaches to decentralized learning. Gossip algorithms (both synchronous and asynchronous) suffer from high communication cost, while random-walk-based learning suffers from slow convergence. In this paper, we design a fast and communication-efficient asynchronous decentralized learning mechanism, DIGEST, that takes advantage of both gossip and random-walk ideas and focuses on stochastic gradient descent (SGD). DIGEST builds on local-SGD, which was originally designed for communication-efficient centralized learning. We analyze the convergence of DIGEST and prove that it approaches the optimal solution asymptotically for both iid and non-iid data distributions. We evaluate the performance of DIGEST for logistic regression and for a deep neural network, ResNet20. The simulation results confirm that multi-stream DIGEST converges well, and that it converges faster than the baselines when the data distribution is non-iid.
Submission Number: 85