Keywords: differential privacy, quantile estimation
TL;DR: A new mechanism for private quantile estimation that is more precise in most realistic parameter ranges
Abstract: In the approximate quantiles problem, the goal is to output $m$ quantile estimates whose ranks in the dataset are as close as possible to those specified by $m$ given quantiles $0 \leq q_1 \leq \dots \leq q_m \leq 1$.
We present a mechanism for approximate quantiles that satisfies $\varepsilon$-differential privacy for a dataset of $n$ real numbers, where $\psi$ bounds the ratio between the size of the domain and the distance between the closest pair of points.
As long as the minimum gap between quantiles is sufficiently large, $|q_i-q_{i-1}|\geq \Omega\left(\frac{m\log(m)\log(\psi)}{n\varepsilon}\right)$ for all $i$, the maximum rank error of our mechanism is $O\left(\frac{\log(\psi) + \log^2(m)}{\varepsilon}\right)$ with high probability.
Previously, the best known algorithm under pure DP was due to Kaplan, Schnapp, and Stemmer (ICML '22), who achieved a bound of $O\left(\frac{\log(\psi)\log^2(m) + \log^3(m)}{\varepsilon}\right)$.
Our improvement stems from the use of continual counting techniques, which allow the quantiles to be randomized in a correlated manner.
We also present an $(\varepsilon,\delta)$-differentially private mechanism that relaxes the gap assumption without affecting the error bound, improving on existing methods when $\delta$ is sufficiently close to zero.
We provide an experimental evaluation confirming that our mechanism compares favorably to prior work in practice, in particular when the number of quantiles $m$ is large.
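To make the error measure concrete, the sketch below computes the maximum rank error of a set of quantile estimates against a dataset: for each estimate, the gap between the number of data points below it and the target rank $q_i \cdot n$. This is an illustrative reading of the metric from the abstract, not the paper's exact definition; the function name and tie-breaking convention are assumptions.

```python
import numpy as np

def max_rank_error(data, estimates, quantiles):
    """Illustrative maximum rank error (an assumed reading of the metric):
    for each estimate, |#{points below it} - q_i * n|, maximized over i."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    errors = []
    for q, est in zip(quantiles, estimates):
        rank = np.searchsorted(x, est)  # number of points strictly below est
        errors.append(abs(rank - q * n))
    return max(errors)

# Toy example: the exact empirical quantiles incur (near-)zero rank error;
# a private mechanism trades some rank error for privacy.
rng = np.random.default_rng(0)
data = rng.uniform(0, 1, 100)
qs = [0.25, 0.5, 0.75]
ests = np.quantile(data, qs)
print(max_rank_error(data, ests, qs))
```

A private mechanism's estimates would replace `ests` above; the paper's bound says this error stays $O\left(\frac{\log(\psi) + \log^2(m)}{\varepsilon}\right)$ with high probability under the stated gap assumption.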
Supplementary Material: zip
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 21075