Private and debiased model training: A fair differential privacy gradient framework

ICLR 2026 Conference Submission 17989 Authors

19 Sept 2025 (modified: 08 Oct 2025) · License: CC BY 4.0
Keywords: Deep learning, differential privacy, disparate impact, fairness, stochastic gradient descent
Abstract: Deep learning models are vulnerable to leaking private information about their training data. Differential privacy (DP) is increasingly used in deep learning to preserve data privacy, one common approach being to impose DP on the gradients during training, known as DP gradients. Unfortunately, adding DP to gradients can harm the robustness or fairness of deep learning models, and sometimes both, resulting in unexpected degradation of their data management tasks. In this paper, we undertake a deep exploration of the disparate impact of DP gradients and how to mitigate it. Specifically, through empirical analysis we show that gradient variance produces a clear disparate impact across groups, and we provide a theoretical proof of the relationship between gradient variance and model fairness. We then develop a Fair Differential Privacy Gradient (FDPG) framework to alleviate the disparate impact of DP gradients while protecting data privacy. To implement this framework, we create a fairness-aware sampling mechanism to restore balance among groups and design an adaptive noise injection strategy to maintain model utility. Our experimental evaluations demonstrate the effectiveness of FDPG on multiple mainstream classification tasks with both single and multiple protected attributes.
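The abstract does not specify FDPG's algorithmic details, so the following minimal NumPy sketch only illustrates the general shape of a DP-SGD-style update in which a hypothetical loss-proportional group sampling rule stands in for the fairness-aware sampler described above; the function names, the sampling rule, and all parameters (`C`, `sigma`, `batch`) are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only (not the paper's FDPG algorithm): a DP-SGD-style step for
# logistic regression with per-example gradient clipping, Gaussian noise, and a
# hypothetical group-aware sampling rule that upweights groups with higher loss.
import numpy as np

rng = np.random.default_rng(0)

def clip(g, C):
    """Clip a per-example gradient to L2 norm at most C (standard DP-SGD clipping)."""
    n = np.linalg.norm(g)
    return g * min(1.0, C / n) if n > 0 else g

def fairness_aware_dp_step(w, X, y, groups, lr=0.1, C=1.0, sigma=1.0, batch=64):
    # Hypothetical fairness-aware sampling: sample examples from higher-loss groups
    # more often, as a stand-in for the paper's mechanism to "restore balance".
    loss = np.log1p(np.exp(-y * (X @ w)))                  # per-example logistic loss
    group_ids = np.unique(groups)
    group_loss = np.array([loss[groups == g].mean() for g in group_ids])
    p_group = group_loss / group_loss.sum()
    p = np.array([p_group[list(group_ids).index(g)] / (groups == g).sum() for g in groups])
    idx = rng.choice(len(y), size=batch, replace=False, p=p / p.sum())

    # Per-example gradients, clipped to bound each example's contribution.
    grads = np.zeros((batch, len(w)))
    for k, i in enumerate(idx):
        s = 1.0 / (1.0 + np.exp(y[i] * (X[i] @ w)))        # sigmoid(-y_i * x_i.w)
        grads[k] = clip(-y[i] * s * X[i], C)

    # Gaussian noise calibrated to the clipping bound; sigma is fixed here, whereas
    # the paper reportedly adapts the noise injection to preserve utility.
    noisy_grad = grads.mean(axis=0) + rng.normal(0.0, sigma * C / batch, size=len(w))
    return w - lr * noisy_grad

# Toy usage with two unevenly sized protected groups.
X = rng.normal(size=(500, 5))
y = np.sign(rng.normal(size=500))
groups = rng.integers(0, 2, size=500)
w = np.zeros(5)
for _ in range(100):
    w = fairness_aware_dp_step(w, X, y, groups)
```

The sketch deliberately separates the two levers the abstract names, sampling and noise, so either could be replaced by the paper's actual mechanisms without touching the clipping-and-noise core.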
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 17989