Keywords: Differential Privacy, Generative Models, Black-Box Membership Inference Attack, Quantitative Gaussianization, Loss Path Kernels
TL;DR: We show that DP–SGD stability and latent Gaussian randomness combine to give tighter, dimension-aware privacy guarantees for black-box generative models than worst-case DP accounting.
Abstract: Black-box differentially private generative models often appear more private than worst-case accounting suggests, leaving a gap between formal Differential Privacy (DP) budgets and the observed weakness of membership inference attacks. We address this gap from a test-centric $f$-DP perspective. On the training side, we show that Differentially Private Stochastic Gradient Descent (DP-SGD) provides function-level stability, which can be quantified through loss-path kernels rather than parameter proximity. On the sampling side, the high-dimensional latent randomness of modern generators yields approximately Gaussian behavior, enabling a clean reduction to Gaussian DP. Combining these ingredients yields an effective signal parameter with small slack. The resulting envelopes predict that black-box distinguishability decreases with dataset size and effective latent dimension, and grows only sublinearly across multiple releases, while formal DP budgets remain unchanged. Simulations and empirical tests confirm these predictions and align with observed attack performance, suggesting that our framework offers a practical and conservative tool for auditing the privacy of DP-trained generative models.
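Illustration (ours, not part of the submission): a minimal Python sketch of the standard Gaussian DP quantities the abstract refers to, namely the $\mu$-GDP trade-off curve $T(\alpha)=\Phi(\Phi^{-1}(1-\alpha)-\mu)$, the implied bound $2\Phi(\mu/2)-1$ on membership-inference advantage, and the $\sqrt{k}$ composition of the effective signal parameter across $k$ releases. The value mu_single is a hypothetical placeholder, not a number from the paper.

    # Sketch of mu-GDP bookkeeping under the assumptions stated above.
    import numpy as np
    from scipy.stats import norm

    def gdp_tradeoff(alpha, mu):
        # Type-II error lower bound T(alpha) = Phi(Phi^{-1}(1 - alpha) - mu) for a mu-GDP mechanism.
        return norm.cdf(norm.ppf(1.0 - alpha) - mu)

    def mi_advantage_bound(mu):
        # Maximal membership-inference advantage (TPR - FPR) permitted by mu-GDP: 2*Phi(mu/2) - 1.
        return 2.0 * norm.cdf(mu / 2.0) - 1.0

    def compose_gdp(mu_single, k):
        # k identical mu-GDP releases compose to sqrt(k) * mu, i.e. sublinear growth per release.
        return np.sqrt(k) * mu_single

    mu_single = 0.2  # hypothetical effective signal parameter for a single release
    for k in (1, 4, 16):
        mu_k = compose_gdp(mu_single, k)
        print(f"k={k:2d}: mu={mu_k:.3f}, advantage bound={mi_advantage_bound(mu_k):.3f}")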
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 8904