Keywords: Communication Compression, Distributed Optimization, Unbiased Compression, Optimal Complexity
TL;DR: This paper explores when unbiased compression can reduce the total communication cost, and by how much.
Abstract: Communication compression is a common technique in distributed optimization
that can alleviate communication overhead by transmitting compressed gradients
and model parameters. However, compression can introduce information distortion,
which slows down convergence and requires more communication rounds to achieve
desired solutions. Given the trade-off between lower per-round communication
costs and additional rounds of communication, it is unclear whether communication
compression reduces the total communication cost.
This paper explores the conditions under which unbiased compression, a widely
used form of compression, can reduce the total communication cost, as well as the
extent to which it can do so. To this end, we present the first theoretical formulation
for characterizing the total communication cost in distributed optimization with
unbiased compressors. We demonstrate that unbiased compression alone does not
necessarily reduce the total communication cost, but that a reduction can be achieved
if the compressors used by all workers are further assumed to be independent. We
establish lower bounds on the communication rounds required by algorithms using
independent unbiased compressors to minimize smooth convex functions and
show that these lower bounds are tight by refining the analysis for ADIANA.
Our results reveal that using independent unbiased compression can reduce the
total communication cost by a factor of up to $\Theta(\sqrt{\min\{n,\kappa\}})$ when all local
smoothness constants are constrained by a common upper bound, where $n$ is the
number of workers and $\kappa$ is the condition number of the functions being minimized.
These theoretical findings are supported by experimental results.
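
To make the role of independence concrete, the following is a minimal illustrative sketch and not the paper's algorithm or experimental setup. It assumes a rand-k sparsifier as the unbiased compressor (a standard example, chosen here for illustration), and it assumes for simplicity that all workers compress the same vector. The names `rand_k` and `mean_sq_error` are hypothetical. The sketch compares the error of the averaged compressed vector when workers share one random mask against the case where each worker draws its mask independently.

```python
import numpy as np

def rand_k(x, k, rng):
    """Rand-k sparsifier (assumed example of an unbiased compressor).

    Keeps k coordinates chosen uniformly at random and rescales them
    by d/k so that the expectation of the output equals x.
    """
    d = x.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(x)
    out[idx] = x[idx] * (d / k)
    return out

def mean_sq_error(x, k, n_workers, independent, trials, seed=0):
    """Monte Carlo estimate of E||avg_i C_i(x) - x||^2 over `trials` draws."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        if independent:
            # Each worker draws its own mask; averaging cancels part of the noise.
            avg = np.mean([rand_k(x, k, rng) for _ in range(n_workers)], axis=0)
        else:
            # All workers share one mask, so the average equals a single compressed copy.
            avg = rand_k(x, k, rng)
        total += np.sum((avg - x) ** 2)
    return total / trials

x = np.linspace(1.0, 2.0, 20)  # toy vector with d = 20 coordinates
err_shared = mean_sq_error(x, k=5, n_workers=8, independent=False, trials=2000)
err_indep = mean_sq_error(x, k=5, n_workers=8, independent=True, trials=2000)
print(f"shared mask error:       {err_shared:.2f}")
print(f"independent masks error: {err_indep:.2f}")
```

In this toy setting, each compressor alone has expected squared error $(d/k - 1)\|x\|^2$, and averaging $n$ independent copies divides that error by $n$, whereas shared randomness yields no reduction. This is only meant to illustrate why an independence assumption can matter; it does not reproduce the lower bounds or the ADIANA analysis described above.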
Supplementary Material: pdf
Submission Number: 5699