## $\texttt{DoStoVoQ}$: Doubly Stochastic Voronoi Vector Quantization SGD for Federated Learning

21 May 2021, 20:43 (modified: 04 Jun 2021, 19:55) · NeurIPS 2021 Submitted · Readers: Everyone
Keywords: federated learning, distributed optimization, quantization, Voronoi
TL;DR: We propose a novel gradient quantization scheme for distributed SGD which is based on random codebooks.
Abstract: The growing size of models and datasets has made distributed implementation of stochastic gradient descent (SGD) an active field of research. However, the high bandwidth cost of communicating gradient updates between nodes remains a bottleneck, and lossy compression is a way to alleviate this problem. We propose a new $\textit{unbiased}$ Vector Quantizer (VQ), named $\texttt{StoVoQ}$, to perform gradient quantization. This approach introduces randomness into the quantization process through the use of unitarily invariant random codebooks, combined with a straightforward bias-compensation method. The distortion of $\texttt{StoVoQ}$ significantly improves upon that of existing quantization algorithms. Next, we explain how to combine this quantization scheme within a Federated Learning framework for complex high-dimensional models (dimension $>10^6$), introducing $\texttt{DoStoVoQ}$. We provide theoretical guarantees on the quadratic error and (absence of) bias of the compressor, which allow us to leverage strong convergence results, e.g., with heterogeneous workers or variance reduction. Finally, we show that, when training on convex and non-convex deep learning problems, our method significantly reduces bandwidth use while preserving model accuracy.
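The abstract's core mechanism (a randomly drawn, rotation-invariant codebook, nearest-codeword Voronoi quantization, and a scalar correction that makes the quantizer unbiased in expectation) can be sketched as follows. This is a hypothetical illustration, not the paper's actual scheme: the function names, the sphere-uniform codebook, and the Monte Carlo estimate of the bias-compensation factor are all our assumptions.

```python
import numpy as np

def random_codebook(d, m, rng):
    # Hypothetical codebook: m codewords drawn uniformly on the unit sphere.
    # Their distribution is invariant under rotations, the symmetry that a
    # unitarily invariant codebook provides.
    c = rng.standard_normal((m, d))
    return c / np.linalg.norm(c, axis=1, keepdims=True)

def nearest_codeword(x, codebook):
    # Voronoi quantization: return the codeword closest to x.
    idx = np.argmin(np.linalg.norm(codebook - x, axis=1))
    return codebook[idx]

def debias_scale(d, m, n_mc, rng):
    # By rotational symmetry, E[Q(u)] = alpha * u for any unit vector u.
    # Estimate alpha by Monte Carlo over fresh codebooks (an assumed,
    # simplistic stand-in for the paper's bias compensation).
    u = np.zeros(d)
    u[0] = 1.0
    acc = 0.0
    for _ in range(n_mc):
        acc += nearest_codeword(u, random_codebook(d, m, rng))[0]
    return acc / n_mc

def stovoq_sketch(x, m=64, n_mc=300, rng=None):
    # Quantize the direction of x with a freshly drawn random codebook,
    # then rescale by the norm and the bias-compensation factor so that
    # the output is (approximately) unbiased: E[Q(x)] ~ x.
    rng = rng if rng is not None else np.random.default_rng(0)
    d = x.shape[0]
    nx = np.linalg.norm(x)
    q = nearest_codeword(x / nx, random_codebook(d, m, rng))
    alpha = debias_scale(d, m, n_mc, rng)
    return (nx / alpha) * q
```

In a distributed-SGD setting, only the codeword index (about $\log_2 m$ bits), the norm, and the shared random seed would need to be communicated; averaging many independent quantizations recovers the original vector, which is the property that makes unbiased compressors compatible with convergence analyses.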
Supplementary Material: zip
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.