Keywords: Decentralized Learning, Distributed Optimization, Communication Efficient Learning, Local SGD, Federated Learning
Abstract: Decentralized learning advocates the elimination of centralized parameter servers (aggregation points) for potentially better utilization of underlying resources, delay reduction, and resiliency against parameter server unavailability and catastrophic failures. Gossip-based decentralized algorithms, in which each node in a network keeps its own local model and trains it by communicating with its neighbors, have received a lot of attention recently. Despite their potential, Gossip algorithms introduce huge communication costs. In this work, we show that nodes do not need to communicate as frequently as in Gossip for fast convergence; in fact, a sporadic exchange of a digest of a trained model is sufficient. Thus, we design a fast and communication-efficient decentralized learning mechanism, DIGEST, focusing in particular on stochastic gradient descent (SGD). DIGEST is a decentralized algorithm building on local-SGD algorithms, which were originally designed for communication-efficient centralized learning. We show through analysis and experiments that DIGEST significantly reduces the communication cost without hurting convergence time for both iid and non-iid data.
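To make the communication trade-off in the abstract concrete, here is a minimal sketch of generic decentralized local SGD on a ring. It is not the DIGEST algorithm itself, whose details are not given on this page. Everything in it is an illustrative assumption: the function name `run_local_sgd`, the scalar quadratic objectives, the ring topology, and the uniform three-way neighbor averaging. Setting `local_steps=1` mimics Gossip-style communication at every iteration, while a larger value exchanges models only sporadically.

```python
import random


def run_local_sgd(num_nodes=8, steps=200, local_steps=10, lr=0.1, seed=0):
    """Toy decentralized local SGD on a ring (illustrative, not DIGEST).

    Node i minimizes f_i(w) = 0.5 * (w - c_i)^2 with c_i = i, so the data
    are non-iid and the global optimum is the mean of the c_i.
    """
    rng = random.Random(seed)
    targets = [float(i) for i in range(num_nodes)]  # hypothetical local optima
    models = [0.0] * num_nodes
    comm_rounds = 0

    for t in range(1, steps + 1):
        # Local step: each node takes one noisy gradient step on its own model.
        for i in range(num_nodes):
            grad = (models[i] - targets[i]) + rng.gauss(0.0, 0.01)
            models[i] -= lr * grad

        # Sporadic exchange: every `local_steps` iterations, each node
        # averages its model with its two ring neighbors.
        if t % local_steps == 0:
            models = [
                (models[i - 1] + models[i] + models[(i + 1) % num_nodes]) / 3.0
                for i in range(num_nodes)
            ]
            comm_rounds += 1

    return models, comm_rounds


if __name__ == "__main__":
    for h in (1, 10):
        models, rounds = run_local_sgd(local_steps=h)
        avg = sum(models) / len(models)
        print(f"local_steps={h}: comm_rounds={rounds}, network average={avg:.3f}")
```

In this toy setting the network-wide average of the models approaches the global optimum for both settings, while the sporadic variant uses a tenth of the communication rounds. Individual nodes may still differ from one another between exchanges, which is the kind of effect the paper's analysis for non-iid data has to account for.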
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Optimization (eg, convex and non-convex optimization)
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/digest-fast-and-communication-efficient/code)