Instance-Level Smoothing for Enhanced Privacy in Deep Learning: Theoretical Insights and Empirical Validation
Keywords: privacy preserving, adaptive kernel density estimation, medical image classification
TL;DR: Our paper introduces a novel framework for balancing privacy and utility in differentially private deep learning, demonstrating robust performance on the CheXpert medical imaging dataset and backed by theoretical guarantees.
Abstract: In this paper, we address the dual challenge of maintaining high accuracy and ensuring fairness in differentially private (DP) deep learning models. Optimization under DP is complicated by the need to inject random noise and limit the number of training iterations, particularly for over-parameterized models. Moreover, DP mechanisms often exacerbate accuracy disparities across subpopulations, further complicating the balance between privacy and fairness. To tackle these challenges, we introduce a framework that systematically manages the privacy-utility trade-off in DP deep learning. At its core is instance-level smoothing, which strengthens privacy protection without compromising performance. Our theoretical contributions characterize the sample complexity, instance-level smoothing factors, and error bounds required to achieve a given privacy budget, providing a principled basis for tuning the privacy-utility trade-off. The resulting guarantees are independent of the number of training iterations, the model parameterization, batch normalization, and subpopulation disparities, allowing the method to adapt to a wide range of scenarios. Through extensive empirical studies on the large-scale medical imaging dataset CheXpert, we validate the effectiveness of our approach: the results align with our theoretical predictions, showing that the method meets stringent privacy requirements while maintaining high performance. By bridging the gap between formal privacy guarantees and practical deep learning, this work enables practitioners to protect sensitive data during model training while preserving both privacy and generalization, paving the way for more secure and equitable AI systems.
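To make the notion of instance-level smoothing concrete, below is a minimal sketch, assuming it is realized as a per-example clipping and noising step in a DP-SGD-style update in which each example carries its own smoothing factor. The function name `instance_smoothed_dp_update`, its parameters, and the weighting rule are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical sketch of an instance-level smoothing step in a DP-SGD-style
# update. This is NOT the authors' implementation; the per-instance
# smoothing rule and all names are illustrative assumptions.
import numpy as np

def instance_smoothed_dp_update(per_example_grads, clip_norm=1.0,
                                base_sigma=1.0, smoothing_factors=None,
                                rng=None):
    """Clip each example's gradient, rescale it by an instance-level
    smoothing factor, then add Gaussian noise calibrated to the clipping bound.

    per_example_grads : (n, d) array, one gradient row per training example.
    smoothing_factors : (n,) array in (0, 1]; smaller values damp the
                        contribution of examples that need stronger protection.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = per_example_grads.shape
    if smoothing_factors is None:
        smoothing_factors = np.ones(n)

    # Per-example clipping bounds the sensitivity of the summed gradient.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Instance-level smoothing: down-weight examples with small factors.
    smoothed = clipped * smoothing_factors[:, None]

    # Gaussian noise scaled to the clipped sensitivity supplies the DP guarantee.
    noise = rng.normal(0.0, base_sigma * clip_norm, size=d)
    return (smoothed.sum(axis=0) + noise) / n

# Usage: 32 examples with 10-dimensional gradients; the last 8 examples
# receive stronger smoothing (factor 0.5).
grads = np.random.default_rng(0).normal(size=(32, 10))
factors = np.concatenate([np.ones(24), 0.5 * np.ones(8)])
update = instance_smoothed_dp_update(grads, smoothing_factors=factors)
```

Under this assumed form, lowering an example's smoothing factor trades away some of its gradient signal for stronger effective protection, which is the kind of privacy-utility knob the abstract describes.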
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6841