Keywords: federated learning, gradient compression, quantization, communication efficiency
Abstract: Federated Learning (FL) is a powerful technique for training a model on a
server with data from several clients in a privacy-preserving manner. In FL,
the server sends the model to every client; each client trains the model locally
and sends it back to the server. The server aggregates the updated models and
repeats the process for several rounds. FL incurs significant communication
costs, in particular when transmitting the updated local models from the
clients back to the server. Recently proposed algorithms quantize the model
parameters to efficiently compress FL communication. These algorithms
typically have a quantization level that controls the compression factor. We
find that dynamically adapting the quantization level can boost
compression without sacrificing model quality. First, we introduce a
time-adaptive quantization algorithm that increases the quantization level
as training progresses. Second, we introduce a client-adaptive quantization
algorithm that assigns each individual client the optimal quantization level
at every round. Finally, we combine both algorithms into DAdaQuant, the
doubly-adaptive quantization algorithm. Our experiments show that DAdaQuant
consistently improves client$\rightarrow$server compression, outperforming
the strongest non-adaptive baselines by up to $2.8\times$.
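The abstract does not spell out the exact schedules DAdaQuant uses, so the Python sketch below only illustrates the two adaptation ideas under stated assumptions: `time_adaptive_level` ramps the quantization level up linearly over rounds (a hypothetical schedule), `client_adaptive_levels` allots more bits to clients with larger aggregation weight (a placeholder rule), and `stochastic_quantize` is a generic unbiased fixed-point quantizer standing in for the paper's quantizer. All function names and parameters are illustrative, not the published algorithm.

```python
import numpy as np

def time_adaptive_level(round_idx, total_rounds, q_min=1, q_max=8):
    # Hypothetical linear ramp: coarse quantization (few bits) early in
    # training, finer quantization as training progresses.
    frac = round_idx / max(1, total_rounds - 1)
    return int(round(q_min + frac * (q_max - q_min)))

def client_adaptive_levels(base_level, client_weights):
    # Placeholder per-client rule: clients with larger aggregation weight
    # (e.g. more local samples) receive more bits, so the quantization
    # error of influential updates is smaller.
    w = np.asarray(client_weights, dtype=float)
    w = w / w.sum()
    return [max(1, round(base_level * len(w) * wi)) for wi in w]

def stochastic_quantize(update, bits):
    # Generic fixed-point quantizer with 2**bits levels; stochastic
    # rounding makes the quantized update unbiased in expectation.
    levels = (1 << bits) - 1
    lo, hi = float(update.min()), float(update.max())
    span = max(hi - lo, 1e-12)
    scaled = (update - lo) / span * levels
    q = np.floor(scaled + np.random.rand(*update.shape))
    return q.astype(np.uint32), lo, hi

def dequantize(q, lo, hi, bits):
    # Server-side reconstruction of a quantized client update.
    levels = (1 << bits) - 1
    return q.astype(np.float64) / levels * (hi - lo) + lo

# Example: three clients with unequal data sizes, round 10 of 100.
base = time_adaptive_level(round_idx=10, total_rounds=100)
per_client = client_adaptive_levels(base, client_weights=[100, 300, 600])
update = np.random.randn(1000)
q, lo, hi = stochastic_quantize(update, per_client[0])
restored = dequantize(q, lo, hi, per_client[0])
```

Stochastic rounding keeps each quantized update unbiased, which is what allows low quantization levels without systematically biasing the aggregated model; the specific ramp and bit-allocation rules above are assumptions made for illustration.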
One-sentence Summary: We develop an algorithm that boosts the performance of quantization-based compression algorithms for Federated Learning.