Efficient Unbiased Sparsification

Published: 01 Jan 2024 · Last Modified: 27 Sept 2025 · ISIT 2024 · CC BY-SA 4.0
Abstract: An unbiased m-sparsification of a vector $p \in \mathbb{R}^n$ is a random vector $Q \in \mathbb{R}^n$ with mean $p$ that has at most $m < n$ nonzero coordinates. Unbiased sparsification compresses the original vector without introducing bias; it arises in various contexts, such as in federated learning and sampling sparse probability distributions. Ideally, unbiased sparsification should also minimize the expected value of a divergence function $\mathrm{Div}(Q, p)$ that measures how far away $Q$ is from the original $p$. If $Q$ is optimal in this sense, then we call it efficient. Our main results describe efficient unbiased sparsifications for divergences that are either permutation-invariant or additively separable. Surprisingly, the characterization for permutation-invariant divergences is robust to the choice of divergence function, in the sense that our class of optimal $Q$ for squared Euclidean distance coincides with our class of optimal $Q$ for Kullback-Leibler divergence, or indeed any of a wide variety of divergences.
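To make the definition concrete, here is a minimal sketch of one well-known unbiased m-sparsifier: keep $m$ of the $n$ coordinates chosen uniformly at random and rescale the kept entries by $n/m$ so that $\mathbb{E}[Q] = p$. This is only an illustration of the unbiasedness property from the abstract, not the paper's efficient construction; the function name and interface are assumptions.

```python
import numpy as np

def unbiased_sparsify(p, m, rng):
    """Illustrative unbiased m-sparsification of p (uniform subsampling).

    Each coordinate is kept with inclusion probability m/n, so rescaling
    kept entries by n/m gives E[Q_i] = (m/n) * (n/m) * p_i = p_i.
    """
    n = p.size
    idx = rng.choice(n, size=m, replace=False)  # m coordinates, uniform, no repeats
    q = np.zeros(n)
    q[idx] = p[idx] * (n / m)  # rescale to cancel the inclusion probability
    return q  # at most m nonzero coordinates, mean p
```

Averaging many independent draws of `unbiased_sparsify(p, m, rng)` recovers `p`, confirming unbiasedness; which sparsifier among all unbiased ones minimizes the expected divergence is exactly the question the paper addresses.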