Numerical Accounting in the Shuffle Model of Differential Privacy

Published: 07 Mar 2023, Last Modified: 02 Apr 2024. Accepted by TMLR.
Event Certifications: iclr.cc/ICLR/2024/Journal_Track
Abstract: The shuffle model of differential privacy is a novel distributed privacy model based on a combination of local privacy mechanisms and a secure shuffler. It has been shown that the additional randomisation provided by the shuffler improves privacy bounds compared to purely local mechanisms. Computing tight bounds, however, is complicated by the shuffler. Recently proposed numerical techniques for evaluating $(\varepsilon,\delta)$-differential privacy guarantees have been shown to give tighter bounds than commonly used methods for compositions of various complex mechanisms. In this paper, we show how to utilise these numerical accountants for adaptive compositions of general $\varepsilon$-LDP shufflers and for shufflers of $k$-randomised response mechanisms, including their subsampled variants. This is enabled by an approximation that speeds up the evaluation of the corresponding privacy loss distribution from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$, where $n$ is the number of users, without noticeable change in the resulting $\delta(\varepsilon)$-upper bounds. We also demonstrate the looseness of existing bounds and methods found in the literature, improving previous composition results for shufflers significantly.
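As a rough illustration of the setting described in the abstract, the following is a minimal sketch of a $k$-randomised response local randomiser followed by a shuffler. It assumes the common parametrisation in which the true value is kept with probability $e^{\varepsilon_0}/(e^{\varepsilon_0}+k-1)$; the function names and the use of Python's `random` module (which is not a secure shuffler) are illustrative assumptions and are not taken from the paper's code.

```python
import math
import random


def krr_keep_prob(k, eps0):
    """Probability of reporting the true value (assumed standard k-RR form)."""
    return math.exp(eps0) / (math.exp(eps0) + k - 1)


def krr_report(x, k, eps0, rng):
    """Randomise one value x in {0, ..., k-1} with k-randomised response."""
    if rng.random() < krr_keep_prob(k, eps0):
        return x
    # Otherwise report one of the other k-1 values uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < x else other + 1


def shuffle_reports(values, k, eps0, seed=0):
    """Apply k-RR to each user's value, then permute the reports.

    The permutation stands in for the secure shuffler; a seeded
    pseudo-random generator is used only to keep the sketch reproducible.
    """
    rng = random.Random(seed)
    reports = [krr_report(x, k, eps0, rng) for x in values]
    rng.shuffle(reports)
    return reports
```

Under this parametrisation, the ratio between the probability of reporting the true value and the probability of reporting any other fixed value is $e^{\varepsilon_0}$, which is what makes the local randomiser $\varepsilon_0$-LDP.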
Certifications: Featured Certification
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We thank the reviewers for their constructive comments, which have improved the manuscript. We have made a major revision in which we replaced the 'worst-case' distribution given by Feldman et al. (2021) with the improved P and Q of Feldman et al. (2022) ("Stronger Privacy Amplification by Shuffling...."). For the RDP to DP conversion, we used the conversion formula by Canonne et al. (2022) (Figure 1 shows the difference between RDP and 'exact DP'). We removed the comparisons to analytic bounds and now compare only to RDP results. We also found an error ourselves that the reviewers had not noticed. It relates to the comment of reviewer th4D about the dominating pairs for subsampling: we were originally using Proposition 30 of Zhu et al. (2022), but we realised that the result does not give a dominating pair (i.e. a pair of distributions that dominates the $\alpha$-divergence for all $\alpha \geq 0$), because the pairs are different for $0 \leq \alpha < 1$ and $\alpha \geq 1$. By taking the maximum of these, as in the statement of Corollary 32 of Zhu et al. (2022), we do, however, obtain a privacy profile (a convex function that has the properties of a privacy profile). To obtain a dominating pair of distributions for this privacy profile, we used Algorithm 1 of Doroshenko et al. (2022) ("Connect the dots: Tighter discrete approximations of privacy loss distributions") and in this way were able to correct our mistake. The resulting bounds are tighter than the subsampling RDP bounds of Girgis et al. (2021). As suggested by reviewer KLwm, we changed the title of the paper to "Numerical Accounting in the Shuffle Model of Differential Privacy".
EDIT: We have made another revision with two larger changes. 1) We realised that the bounds we had computed for the k-RR shufflers were not tight. To compute $\delta$ as a function of $\varepsilon$, we had been computing the tail bound of the privacy loss random variable, following the analysis of Balle et al. (2019, "The privacy blanket of the shuffle model"). This gives only an upper bound for the tight $\delta$, which is given by the hockey-stick divergence. We replaced this k-RR accounting with numerical accounting based on the hockey-stick divergence, and focused on giving expressions for the PRV (which can be plugged into the numerical accountants). We modified the text and experiments correspondingly. 2) When determining the privacy loss random variable for k-RR under the assumptions of Balle et al. (2019), we realised that one claim by Balle et al. (2019) does not seem to be true: the variables $N_1$ and $N_2$ in the proof of Thm 3.1 of https://arxiv.org/pdf/1903.02837.pdf (page 10) cannot be assumed to be independent binomials, since there is some correlation between them. We determined the correct form of the distribution of $(N_1,N_2)$ (a certain double binomial, similar to the analysis by Feldman et al. (2022)) and as a result obtained the PRVs for the k-RR shufflers (under the given assumptions about the adversary).
EDIT 2: Fixed a few typos.
EDIT 3: Corrected typos, corrected a few small errors in the code, and updated Figures 1 and 4. Updated the supplementary code.
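The gap between the tail bound and the hockey-stick divergence mentioned in point 1) can be seen on a toy example. The sketch below compares the two quantities for a pair of discrete distributions; it uses binary randomised response as a stand-in pair, which is an assumption made purely for illustration and is not the shuffler PRV derived in the paper. All function names are hypothetical.

```python
import math


def hockey_stick(p, q, eps):
    """Tight delta(eps): sum over outcomes of max(P(x) - e^eps * Q(x), 0)."""
    return sum(max(pi - math.exp(eps) * qi, 0.0) for pi, qi in zip(p, q))


def tail_bound(p, q, eps):
    """Upper bound on delta(eps): Pr_{x ~ P}[log(P(x) / Q(x)) > eps]."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0 and (qi == 0 or math.log(pi / qi) > eps):
            total += pi
    return total


def binary_rr(eps0):
    """Output distributions of binary randomised response on inputs 0 and 1."""
    keep = math.exp(eps0) / (math.exp(eps0) + 1.0)
    return [keep, 1.0 - keep], [1.0 - keep, keep]


if __name__ == "__main__":
    p, q = binary_rr(1.0)
    for eps in (0.0, 0.5, 0.9):
        print(eps, hockey_stick(p, q, eps), tail_bound(p, q, eps))
```

For every $\varepsilon$ the hockey-stick value is at most the tail probability, since the hockey-stick divergence weights each outcome in the tail by a factor in $[0, 1]$ rather than counting its full probability mass.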
Code: https://github.com/DPBayes/numerical-shuffler-experiments
Assigned Action Editor: ~Yu-Xiang_Wang1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 469