Abstract: Privacy concerns grow with the success of modern deep learning models, especially when the training set contains sensitive data. Differentially private generative models (DPGMs) can serve as a solution to circumvent such concerns by generating data that are distributionally similar to the original data while providing differential privacy (DP) guarantees. While GANs have attracted the most attention, existing DPGMs based on flow generative models are limited and have only been developed on low-dimensional tabular datasets. The capability of exact density estimation makes flow models exceptional when density estimation is of interest. In this work, we first show that it is challenging (or even infeasible) to train a DP flow model via DP-SGD, the workhorse algorithm for private deep learning, on high-dimensional image sets with acceptable utility, and we then give an effective solution by moving generation from the pixel space to a lower-dimensional latent space. We show the effectiveness and scalability of the proposed method via extensive experiments, where it achieves a significantly better privacy-utility trade-off than existing alternatives. Notably, our method is the first DPGM to scale to high-resolution image sets (up to 256 × 256). Our code is available at https://github.com/dihjiang/DP-LFlow.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: + We add comparisons with recent work on DP diffusion models in Sections 4.2 and 6.
+ We add an explicit limitation paragraph in Section 5 to discuss the scaling laws.
+ We add more explanation of the BN challenge in Section 1.
+ We add a code link in the Abstract.
Code: https://github.com/dihjiang/DP-LFlow
Assigned Action Editor: ~Florian_Tramer1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1254