Label Robust and Differentially Private Linear Regression: Computational and Statistical Efficiency

Xiyang Liu; Prateek Jain; Weihao Kong; Sewoong Oh; Arun Suggala

Published: 21 Sept 2023, Last Modified: 02 Nov 2023. Venue: NeurIPS 2023 poster

Keywords: Differential Privacy; Private Estimation

TL;DR: We provide the first efficient algorithm that achieves a near-optimal rate for DP linear regression and achieves privacy and robustness against label corruptions simultaneously.

Abstract: We study the canonical problem of linear regression under $(\varepsilon,\delta)$-differential privacy when the datapoints are sampled i.i.d. from a distribution and a fraction of the response variables are adversarially corrupted. We provide the first method for this problem that is provably efficient, both computationally and statistically, under standard assumptions on the data distribution. Our algorithm is a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two key innovations: full-batch gradient descent to improve sample complexity and a novel adaptive clipping to guarantee robustness. Our method requires only time linear in the input size, and still matches the information-theoretically optimal sample complexity up to a condition-number factor that depends on the data distribution. Interestingly, the same algorithm, when applied to a setting with no adversarial corruption, still improves upon the existing state of the art and achieves a near-optimal sample complexity.
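To make the general shape of the approach concrete, the following is a minimal sketch of full-batch gradient descent for linear regression with per-sample gradient clipping and Gaussian noise. It is not the authors' algorithm: the clipping threshold here is a fixed constant rather than the adaptive clipping described in the abstract, the noise scale is not calibrated to any particular $(\varepsilon,\delta)$ budget, and no privacy accounting is done. The names `clip_vector`, `dp_full_batch_gd`, `clip`, and `noise_multiplier` are hypothetical and chosen only for illustration.

```python
import math
import random


def clip_vector(v, threshold):
    """Rescale v so that its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= threshold or norm == 0.0:
        return list(v)
    scale = threshold / norm
    return [c * scale for c in v]


def dp_full_batch_gd(X, y, clip, noise_multiplier, steps, lr, seed=0):
    """Illustrative noisy full-batch gradient descent for least squares.

    Assumptions (not taken from the paper): a fixed clipping threshold
    `clip`, and Gaussian noise with standard deviation
    noise_multiplier * clip added to the summed gradient at every step.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        total = [0.0] * d
        for x_i, y_i in zip(X, y):
            # Per-sample gradient of 0.5 * (x_i . w - y_i)^2 with respect to w.
            residual = sum(a * b for a, b in zip(x_i, w)) - y_i
            g = clip_vector([residual * a for a in x_i], clip)
            total = [t + c for t, c in zip(total, g)]
        # Clipping bounds each sample's contribution; noise is scaled to that bound.
        sigma = noise_multiplier * clip
        noisy = [t + rng.gauss(0.0, sigma) for t in total]
        w = [wj - lr * gj / n for wj, gj in zip(w, noisy)]
    return w
```

Clipping each per-sample gradient bounds how much any single datapoint, including one with a corrupted label, can move the iterate in one step; that bound is also what the added noise is scaled against. With `noise_multiplier=0.0` and a very large `clip`, the routine reduces to ordinary full-batch gradient descent.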

Supplementary Material: pdf

Submission Number: 8879
