Correlation Aware Sparsified Mean Estimation Using Random Projection

Shuli Jiang; Pranay Sharma; Gauri Joshi

Correlation Aware Sparsified Mean Estimation Using Random Projection

Shuli Jiang, Pranay Sharma, Gauri Joshi

Published: 21 Sept 2023, Last Modified: 02 Nov 2023NeurIPS 2023 posterEveryoneRevisionsBibTeX

Keywords: distributed vector mean estimation, communication efficiency, cross-client correlation

TL;DR: Communication efficient sparsification based distributed vector mean estimation leveraging cross-client correlation information through random projection.

Abstract: We study the problem of communication-efficient distributed vector mean estimation, which is a commonly used subroutine in distributed optimization and Federated Learning (FL). Rand-$k$ sparsification is a commonly used technique to reduce communication cost, where each client sends $k < d$ of its coordinates to the server. However, Rand-$k$ is agnostic to any correlations, that might exist between clients in practical scenarios. The recently proposed Rand-$k$-Spatial estimator leverages the cross-client correlation information at the server to improve Rand-$k$'s performance. Yet, the performance of Rand-$k$-Spatial is suboptimal, and improving mean estimation is key to a faster convergence in distributed optimization. We propose the Rand-Proj-Spatial estimator with a more flexible encoding-decoding procedure, which generalizes the encoding of Rand-$k$ by projecting the client vectors to a random $k$-dimensional subspace. We utilize Subsampled Randomized Hadamard Transform (SRHT) as the projection matrix, and show that Rand-Proj-Spatial with SRHT outperforms Rand-$k$-Spatial, using the correlation information more efficiently. Furthermore, we propose an approach to incorporate varying degrees of correlation, and suggest a practical variant of Rand-Proj-Spatial when the correlation information is not available to the server. Finally, experiments on real-world distributed optimization tasks showcase the superior performance of Rand-Proj-Spatial compared to Rand-$k$-Spatial and other more sophisticated sparsification techniques.

Supplementary Material: pdf

Submission Number: 14023

Loading