Keywords: Differential Privacy, Second Moment Estimation, High-Dimensional Statistics, Subsampling, Private Learning
TL;DR: We propose a differentially private algorithm that approximates the second moment matrix of any subsamplable dataset, and show its applicability to heavy-tailed distributions and input with outliers.
Abstract: We study the problem of differentially private second moment estimation and present a new algorithm that achieves strong privacy-utility trade-offs even for worst-case inputs under subsamplability assumptions on the data.
We call an input $(m,\alpha,\beta)$-subsamplable if a random subsample of size $m$ (or larger) preserves, with probability at least $1-\beta$, the spectral structure of the original second moment matrix up to a multiplicative factor of $1\pm \alpha$.
Building upon subsamplability, we give a recursive algorithmic framework similar to that of Kamath et al. (2019) that satisfies zero-Concentrated Differential Privacy (zCDP) while preserving, with high probability, the accuracy of the second moment estimate up to an arbitrary factor of $(1\pm\gamma)$.
We then show how to apply our algorithm to approximate the second moment matrix of a distribution $\mathcal{D}$, even when a noticeable fraction of the input points are outliers.
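To make the subsample-and-privatize idea concrete, here is a minimal sketch (not the paper's recursive algorithm) of one privatization step: draw a random subsample of size $m$, compute its empirical second moment matrix, and add Gaussian noise calibrated to $\rho$-zCDP. The function name, the bounded-norm assumption $\|x\|\leq 1$, and the sensitivity bound are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def private_second_moment(X, m, rho, rng=None):
    """Hypothetical sketch: rho-zCDP estimate of the second moment
    matrix from a random subsample of m rows of X, assuming every
    row satisfies ||x|| <= 1."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    idx = rng.choice(n, size=m, replace=False)
    S = X[idx]
    M = S.T @ S / m                    # empirical second moment of the subsample
    # Replacing one row changes M by at most 2/m in Frobenius norm
    # (under the ||x|| <= 1 assumption).
    sensitivity = 2.0 / m
    # Gaussian mechanism with scale sigma satisfies (Delta^2 / (2 sigma^2))-zCDP.
    sigma = sensitivity / np.sqrt(2.0 * rho)
    noise = rng.normal(0.0, sigma, size=(d, d))
    noise = (noise + noise.T) / 2.0    # symmetrize to keep the output symmetric
    return M + noise
```

If the input is $(m,\alpha,\beta)$-subsamplable, the subsample's second moment already approximates the full matrix spectrally, so the noise scale shrinks with $m$ rather than with the full dataset's worst-case range.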
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 22677