Keywords: Differential privacy, Theory, Sparse Vector Technique, Quantile
Abstract: We consider the problem of differentially private computation of
quantiles, especially the highest quantiles such as the maximum, when the range
of the dataset is unbounded. We show that this can be done
efficiently through a simple invocation of $\texttt{AboveThreshold}$, a
subroutine that is iteratively called in the fundamental Sparse Vector
Technique, even when there is no upper bound on the data. In particular, we
show that this procedure can give more accurate and robust estimates of the
highest quantiles, with applications to the clipping that is essential for
differentially private sum and mean estimation. In addition, we show how two
invocations can handle the fully unbounded data setting. As part of our study,
we show that an improved analysis of $\texttt{AboveThreshold}$ can tighten the
privacy guarantees of the widely used Sparse Vector Technique, a result of
independent interest. We give a more general characterization of privacy loss
for $\texttt{AboveThreshold}$ which we immediately apply to our method for
improved privacy guarantees. Our algorithm only requires one $O(n)$ pass
through the data, which can be unsorted, and each subsequent query takes $O(1)$
time. We empirically compare our unbounded algorithm with the state-of-the-art
algorithms in the bounded setting. For inner quantiles, we find that our method
often performs better on non-synthetic datasets. For the maximal quantiles,
which we apply to differentially private sum computation, we find that our
method performs significantly better.
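
As a rough illustration of the idea summarized above, the following Python sketch runs a single $\texttt{AboveThreshold}$ pass over a geometrically growing grid of candidate values and returns the first candidate whose noisy count of data points reaches the noisy target $qn$. It is not the paper's exact procedure. The function name, the grid $\beta^i - 1$, the default $\beta$, the assumed lower bound of 0 on the data, and the `max_steps` cap are all illustrative assumptions, and the noise scales are those of the textbook $\texttt{AboveThreshold}$ analysis (Laplace noise of scale $2/\varepsilon$ on the threshold and $4/\varepsilon$ on each sensitivity-1 query) rather than the improved analysis described in the abstract. The sketch also assumes the dataset size $n$ is public. For simplicity it recounts the data for every candidate instead of using one $O(n)$ pass.

```python
import numpy as np

def above_threshold_quantile(data, q, epsilon, beta=1.001,
                             max_steps=100_000, rng=None):
    """Illustrative sketch: estimate the q-quantile of non-negative data
    with no known upper bound, using one AboveThreshold pass."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(data, dtype=float)
    n = x.size

    # Textbook AboveThreshold: perturb the threshold once with Lap(2/eps).
    noisy_threshold = q * n + rng.laplace(scale=2.0 / epsilon)

    candidate = 0.0
    for i in range(max_steps):
        # Assumed candidate grid: 0, beta - 1, beta^2 - 1, ... (unbounded above).
        candidate = beta ** i - 1.0
        # Counting query with sensitivity 1: number of points <= candidate.
        count = np.count_nonzero(x <= candidate)
        # Fresh Lap(4/eps) noise on every query; halt at the first success.
        if count + rng.laplace(scale=4.0 / epsilon) >= noisy_threshold:
            return candidate
    # Safety cap reached (hypothetical guard, not part of AboveThreshold).
    return candidate
```

For example, `above_threshold_quantile(salaries, q=0.99, epsilon=1.0)`, where `salaries` is a hypothetical array of non-negative values, would return a candidate clipping bound near the 99th percentile that could then feed a differentially private sum. A smaller $\beta$ gives a finer grid at the cost of more queries before the procedure halts.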
Supplementary Material: pdf
Submission Number: 1700