Large-Scale Public Data Improves Differentially Private Image Generation Quality

Ruihan Wu; Chuan Guo; Kamalika Chaudhuri

Large-Scale Public Data Improves Differentially Private Image Generation Quality

Ruihan Wu, Chuan Guo, Kamalika Chaudhuri

15 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX

Primary Area: societal considerations including fairness, safety, privacy

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Image Generation, Differential Privacy

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

Abstract: Public data has been frequently used to improve the privacy-accuracy trade-off of differentially private machine learning, but prior work largely assumes that this data come from the same distribution as the private. In this work, we look at how to use *generic* large-scale public data to improve the quality of differentially private image generation in Generative Adversarial Networks (GANs), and provide an improved method that uses public data effectively. Our method works under the assumption that the support of the public data distribution contains the support of the private; an example of this is when the public data come from a general-purpose internet-scale image source, while the private data consist of images of a specific type. Detailed evaluations show that our method achieves SOTA in terms of FID score and other metrics compared with existing methods that use public data, and can generate high-quality, photo-realistic images in a differentially private manner.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 446

Loading