Binary Classification under Local Label Differential Privacy Using Randomized Response Mechanisms

Published: 01 Nov 2023, Last Modified: 01 Nov 2023Accepted by TMLREveryoneRevisionsBibTeX
Abstract: Label differential privacy is a popular branch of $\epsilon$-differential privacy for protecting labels in training datasets with non-private features. In this paper, we study the generalization performance of a binary classifier trained on a dataset privatized under the label differential privacy achieved by the randomized response mechanism. Particularly, we establish minimax lower bounds for the excess risks of the deep neural network plug-in classifier, theoretically quantifying how privacy guarantee $\epsilon$ affects its generalization performance. Our theoretical result shows: (1) the randomized response mechanism slows down the convergence of excess risk by lessening the multiplicative constant term compared with the non-private case $(\epsilon=\infty)$; (2) as $\epsilon$ decreases, the optimal structure of the neural network should be smaller for better generalization performance; (3) the convergence of its excess risk is guaranteed even if $\epsilon$ is adaptive to the size of training sample $n$ at a rate slower than $O(n^{-1/2})$. Our theoretical results are validated by extensive simulated examples and two real applications.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Supplementary Material: zip
Assigned Action Editor: ~Florian_Tramer1
Submission Number: 1278