Striking a Balance: An Optimal Mechanism Design for Heterogenous Differentially Private Data Acquisition for Logistic Regression

TMLR Paper2485 Authors

06 Apr 2024 (modified: 16 Apr 2024) | Under review for TMLR | CC BY-SA 4.0
Abstract: We investigate the problem of solving ML tasks using data collected from privacy-sensitive sellers. Since the data is private, sellers must be incentivized through payments to provide their data. Thus, the goal is to design a mechanism that optimizes a weighted combination of test loss, seller privacy, and payment, i.e., one that strikes a balance between obtaining a good privacy-preserving ML model and limiting payments to the sellers. To do this, we first solve logistic regression with known heterogeneous differential privacy guarantees. We then consider the main problem, in which the differential privacy requirements are decided by the buyer to balance the tradeoff between test loss and payments. To solve this problem, we combine our earlier result on logistic regression with known privacy guarantees with standard mechanism design theory to formulate an optimization problem, which is nonconvex. We establish conditions under which the problem can be convexified using a change-of-variables technique. This insight is then harnessed to develop an algorithm that provides an optimal solution. Additionally, we demonstrate the resilience of our mechanism to scenarios in which data points and privacy sensitivities are correlated. Finally, we demonstrate the utility of our algorithm by applying it to the Wisconsin breast cancer dataset.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Surbhi_Goel1
Submission Number: 2485