TL;DR: This paper quantifies the privacy amplification of the sign-based compressor when combined with differentially private mechanisms.
Abstract: The prevalent distributed machine learning paradigm faces two critical challenges: communication efficiency and data privacy. SIGNSGD provides a simple-to-implement approach with improved communication efficiency by requiring workers to share only the signs of the gradients. However, it fails to converge in the presence of data heterogeneity; a simple fix is to add Gaussian noise before taking the signs, yielding the Noisy SIGNSGD algorithm, which enjoys competitive performance while significantly reducing the communication overhead. Existing results suggest that Noisy SIGNSGD with additive Gaussian noise has the same privacy guarantee as classic DP-SGD due to the post-processing property of differential privacy, and that logistic noise may be a good alternative to Gaussian noise when combined with the sign-based compressor. Nonetheless, discarding the magnitudes in Noisy SIGNSGD leads to information loss, which intuitively should amplify privacy. In this paper, we make this intuition rigorous and quantify the privacy amplification of the sign-based compressor. In particular, we analytically show that Gaussian noise leads to a smaller estimation error than logistic noise when combined with the sign-based compressor and may therefore be more suitable for distributed learning with heterogeneous data. We then establish the convergence of Noisy SIGNSGD. Finally, extensive experiments validate the theoretical results.
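The following is a minimal sketch (not the authors' released code) of the Noisy SIGNSGD update described in the abstract: each worker perturbs its gradient with Gaussian noise and transmits only the coordinate-wise signs, and the server aggregates them. The clipping bound, noise scale, and majority-vote aggregation rule shown here are illustrative assumptions.

```python
import numpy as np

def worker_message(grad, clip=1.0, sigma=1.0, rng=None):
    """Clip the gradient, add Gaussian noise, and transmit only the signs (1 bit per coordinate)."""
    if rng is None:
        rng = np.random.default_rng()
    g = grad * min(1.0, clip / (np.linalg.norm(grad) + 1e-12))  # bound per-worker sensitivity
    return np.sign(g + rng.normal(0.0, sigma, size=g.shape))    # noisy sign message

def server_update(params, messages, lr=0.01):
    """Aggregate worker signs by majority vote and take a sign-descent step."""
    vote = np.sign(np.sum(messages, axis=0))
    return params - lr * vote

# Toy usage: three workers with heterogeneous gradients.
rng = np.random.default_rng(0)
params = np.zeros(5)
grads = [rng.normal(size=5) for _ in range(3)]
params = server_update(params, [worker_message(g, rng=rng) for g in grads])
```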
Lay Summary: Training AI models across many devices faces two big hurdles: 1) sending huge amounts of data back and forth is slow and expensive, and 2) each user's private data must be kept safe. SIGNSGD tackles the communication problem by having devices share only whether each model adjustment (gradient) is positive or negative (its "sign"), greatly reducing the communication overhead. However, it struggles when the data on different devices varies too much. To fix this convergence problem, existing works add Gaussian noise to the adjustments before taking the sign, which leads to Noisy SIGNSGD. While adding Gaussian noise is the de facto method for differential privacy, we theoretically show in this paper that discarding the actual size of the adjustments (the "magnitude") can boost privacy protection even further than previously expected.
Primary Area: Social Aspects->Privacy
Keywords: differential privacy, sign-based compressor
Submission Number: 11402