RPWithPrior: Label Differential Privacy in Regression

Published: 08 Jun 2026, Last Modified: 08 Jun 2026. Accepted by TMLR. License: CC BY 4.0
Abstract: With the wide application of machine learning techniques in practice, privacy preservation has gained increasing attention. Protecting user privacy with minimal accuracy loss is a fundamental task in the data analysis and mining community. In this paper, we focus on regression tasks under $\epsilon$-label differential privacy guarantees. Some existing methods convert the regression problem into a classification problem within the framework of label differential privacy. However, such a conversion does not align well with real-world scenarios. To overcome this limitation, we model both the original and the randomized responses as continuous random variables, avoiding discretization entirely. Our approach estimates an optimal interval for the randomized responses and introduces new algorithms for the cases where a prior is known and where it is unknown. Additionally, we prove that our algorithm, RPWithPrior, guarantees $\epsilon$-label differential privacy. Numerical results show that our method consistently performs best on the Communities and Crime dataset. On the Criteo Sponsored Search Conversion Log and California Housing datasets, the performance of our approach remains comparable.
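For readers unfamiliar with continuous randomized responses, the following minimal sketch illustrates the general idea with a standard baseline. It is not RPWithPrior. It assumes labels can be clipped to a known interval `[lower, upper]` and releases each clipped label with Laplace noise of scale `(upper - lower) / epsilon`, which satisfies $\epsilon$-label differential privacy without discretizing the response. The function name and parameters are hypothetical and are not taken from the paper or its code repository.

```python
import random


def laplace_label_mechanism(labels, lower, upper, epsilon, seed=None):
    """Baseline sketch (NOT RPWithPrior): clip labels, then add Laplace noise.

    Assumption: every label is clipped to the known interval [lower, upper],
    so changing one label moves the clipped value by at most (upper - lower).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")

    rng = random.Random(seed)
    scale = (upper - lower) / epsilon  # Laplace scale = sensitivity / epsilon

    noisy = []
    for y in labels:
        clipped = min(max(y, lower), upper)
        # The difference of two Exp(1) draws is a standard Laplace(0, 1) sample.
        noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        noisy.append(clipped + noise)
    return noisy


if __name__ == "__main__":
    # Illustrative values only; features are left untouched under label DP.
    print(laplace_label_mechanism([0.2, 0.9, 1.7], lower=0.0, upper=1.0, epsilon=1.0, seed=0))
```

Only the labels are randomized. A regression model would then be trained on the original features paired with the noisy labels. The abstract's contribution, estimating an optimal interval for the randomized responses with or without a prior, is not represented in this baseline.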
Submission Type: Regular submission (no more than 12 pages of main content)
Code: https://github.com/liuhaixias1/Response_privacy/
Assigned Action Editor: ~Antti_Koskela1
Submission Number: 7424