Keywords: federated averaging, compression, communication efficiency, signSGD
TL;DR: This work proposes the first federated averaging algorithm with sign-based compression.
Abstract: Federated learning is a promising privacy-preserving distributed learning paradigm, but it suffers from high communication cost when training large-scale machine learning models. Sign-based methods, such as SignSGD \citep{bernstein2018signsgd}, have been proposed as a biased gradient compression technique for reducing the communication cost. However, sign-based compression can diverge under heterogeneous data, which has motivated the development of more advanced techniques, such as the error-feedback method and stochastic sign-based compression, to fix this issue.
Nevertheless, these methods still suffer from a significantly slower convergence rate than uncompressed algorithms. Moreover, none of them allows multiple local SGD updates as in FedAvg \citep{mcmahan2017communication}. In this paper, we propose a novel noisy perturbation scheme with a general symmetric noise distribution for sign-based compression, which not only allows one to flexibly control the tradeoff between gradient bias and convergence performance, but also provides a unified view of existing sign-based methods. More importantly, we propose the very first sign-based FedAvg algorithm ($z$-SignFedAvg). Theoretically, we show that $z$-SignFedAvg achieves a faster convergence rate than existing sign-based methods and, under uniformly distributed noise, can even attain the same convergence rate as its uncompressed counterpart. Extensive experiments demonstrate that $z$-SignFedAvg achieves competitive empirical performance on real datasets.
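Illustrative sketch (not the paper's algorithm): the idea of perturbing a vector with symmetric noise before taking its sign can be shown in a few lines of NumPy. The function names `noisy_sign` and `aggregate`, the scale parameter `sigma`, and the choice of uniform noise on [-1, 1] are assumptions made for this example only. The sketch relies on one elementary fact: for z uniform on [-1, 1] and |x| <= sigma, the expectation of sign(x + sigma * z) equals x / sigma, so rescaling the averaged signs by sigma gives an unbiased estimate of x.

```python
import numpy as np

def noisy_sign(update, sigma, rng):
    """Compress a vector to one sign per coordinate after adding scaled uniform noise.

    Hypothetical helper: sigma controls the noise scale (an assumed parameter name).
    """
    noise = rng.uniform(-1.0, 1.0, size=update.shape)
    return np.sign(update + sigma * noise)

def aggregate(client_updates, sigma, rng):
    """Average the clients' noisy signs and rescale by sigma.

    Unbiased for each coordinate whose magnitude is at most sigma
    (uniform-noise case only).
    """
    signs = np.stack([noisy_sign(u, sigma, rng) for u in client_updates])
    return sigma * signs.mean(axis=0)

rng = np.random.default_rng(0)
true_update = np.array([0.3, -0.5, 0.0, 0.8])

# Many clients holding the same update, purely to show the averaging effect.
clients = [true_update for _ in range(20000)]
estimate = aggregate(clients, sigma=1.0, rng=rng)

# Plain sign compression, by contrast, discards all magnitude information.
plain = np.sign(true_update)
```

In this toy setting, `estimate` approaches `true_update` as the number of clients grows, while `plain` keeps only the direction of each coordinate. A larger `sigma` covers larger coordinate magnitudes at the price of higher variance per transmitted sign, which is one way to picture the bias and convergence tradeoff the abstract mentions.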
Is Student: Yes