Keywords: federated learning, gradient compression, quantization, communication efficiency
Abstract: Federated Learning (FL) is a powerful technique for training a model on a
server with data from several clients in a privacy-preserving manner. In FL,
the server sends the model to every client; each client trains the model locally
and sends it back to the server. The server aggregates the updated models and
repeats the process for several rounds. FL incurs significant communication
costs, in particular when transmitting the updated local models from the
clients back to the server. Recently proposed algorithms quantize the model
parameters to efficiently compress FL communication. These algorithms
typically have a quantization level that controls the compression factor. We
find that dynamically adapting the quantization level can boost
compression without sacrificing model quality. First, we introduce a
time-adaptive quantization algorithm that increases the quantization level
as training progresses. Second, we introduce a client-adaptive quantization
algorithm that assigns each individual client the optimal quantization level
at every round. Finally, we combine both algorithms into DAdaQuant, the
doubly-adaptive quantization algorithm. Our experiments show that DAdaQuant
consistently improves client$\rightarrow$server compression, outperforming
the strongest non-adaptive baselines by up to $2.8\times$.
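The abstract does not spell out the exact schedules DAdaQuant uses, so the Python sketch below only illustrates the two adaptation ideas under stated assumptions: `time_adaptive_level` ramps the quantization level up linearly over rounds (a hypothetical schedule), `client_adaptive_levels` allots more bits to clients with larger aggregation weight (a placeholder rule), and `stochastic_quantize` is a generic unbiased fixed-point quantizer standing in for the paper's quantizer. All function names and parameters are illustrative, not the published algorithm.

```python
import numpy as np

def time_adaptive_level(round_idx, total_rounds, q_min=1, q_max=8):
    # Hypothetical linear ramp: coarse quantization (few bits) early in
    # training, finer quantization as training progresses.
    frac = round_idx / max(1, total_rounds - 1)
    return int(round(q_min + frac * (q_max - q_min)))

def client_adaptive_levels(base_level, client_weights):
    # Placeholder per-client rule: clients with larger aggregation weight
    # (e.g. more local samples) receive more bits, so the quantization
    # error of influential updates is smaller.
    w = np.asarray(client_weights, dtype=float)
    w = w / w.sum()
    return [max(1, round(base_level * len(w) * wi)) for wi in w]

def stochastic_quantize(update, bits):
    # Generic fixed-point quantizer with 2**bits levels; stochastic
    # rounding makes the quantized update unbiased in expectation.
    levels = (1 << bits) - 1
    lo, hi = float(update.min()), float(update.max())
    span = max(hi - lo, 1e-12)
    scaled = (update - lo) / span * levels
    q = np.floor(scaled + np.random.rand(*update.shape))
    return q.astype(np.uint32), lo, hi

def dequantize(q, lo, hi, bits):
    # Server-side reconstruction of a quantized client update.
    levels = (1 << bits) - 1
    return q.astype(np.float64) / levels * (hi - lo) + lo

# Example: three clients with unequal data sizes, round 10 of 100.
base = time_adaptive_level(round_idx=10, total_rounds=100)
per_client = client_adaptive_levels(base, client_weights=[100, 300, 600])
update = np.random.randn(1000)
q, lo, hi = stochastic_quantize(update, per_client[0])
restored = dequantize(q, lo, hi, per_client[0])
```

Stochastic rounding keeps each quantized update unbiased, which is what allows low quantization levels without systematically biasing the aggregated model; the specific ramp and bit-allocation rules above are assumptions made for illustration.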
One-sentence Summary: We develop an algorithm that boosts the performance of quantization-based compression algorithms for Federated Learning.