Larger Model Causes Lower Classification Accuracy Under Differential Privacy: Reason and Solution

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: Differential privacy, feature selection, generalization, high dimension.
Abstract: Differential privacy (DP) is an essential technique for privacy preservation, which works by adding random noise to the data. In deep learning, DP stochastic gradient descent (DP-SGD) is a popular technique for building privacy-preserving models. Even with small noise, however, a large model (such as ResNet50) trained by DP-SGD cannot outperform a small model (such as ResNet18). To better understand this phenomenon, we study high-dimensional DP learning from the viewpoint of generalization. Theoretically, we first demonstrate that for the Gaussian mixture model, even with small DP noise, classification can be as bad as random guessing if excess features are used, because noise accumulates during estimation in the high-dimensional feature space. We then propose a robust measure to select the important features, which trades off model accuracy against privacy preservation. Moreover, we establish the conditions under which the important features can be selected by the proposed measure. Experiments on real data (such as CIFAR-10) support our theoretical results and reveal the advantage of the proposed classification and privacy-preserving procedure.
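Since the abstract's central claim is that DP noise accumulating over excess feature dimensions can drive accuracy toward random guessing, the following sketch makes that intuition concrete. It is a minimal illustration, not the paper's construction: the two-class Gaussian mixture setup, the number of informative coordinates `s`, the per-coordinate noise scale `sigma_dp` (standing in for Gaussian-mechanism noise on the mean estimate), and the plug-in linear classifier are all illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact setup):
# two-class Gaussian mixture with means +/-mu in R^d, where only the first
# `s` coordinates carry signal. The mean-difference estimate is privatized
# by adding per-coordinate Gaussian noise of scale `sigma_dp`, and a
# plug-in linear rule sign(<w_hat, x>) is used for classification.
import numpy as np

rng = np.random.default_rng(0)

def accuracy(d, s=10, n=500, signal=1.0, sigma_dp=0.3, n_test=5000):
    mu = np.zeros(d)
    mu[:s] = signal / np.sqrt(s)          # informative block, fixed total signal
    # Training data: labels +/-1, features y * mu + standard Gaussian noise
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu + rng.standard_normal((n, d))
    # Non-private plug-in direction, then DP noise added per coordinate
    w_hat = (X * y[:, None]).mean(axis=0) + sigma_dp * rng.standard_normal(d)
    # Test accuracy of the linear rule sign(<w_hat, x>)
    y_te = rng.choice([-1.0, 1.0], size=n_test)
    X_te = y_te[:, None] * mu + rng.standard_normal((n_test, d))
    return np.mean(np.sign(X_te @ w_hat) == y_te)

for d in [10, 100, 1000, 10000]:
    print(f"d = {d:>5}: test accuracy ~ {accuracy(d):.3f}")
```

With the total signal held fixed, the printed accuracy decays toward 0.5 (the random-guessing level) as d grows, since the DP noise variance in the estimated direction scales with the number of coordinates while the signal does not; this mirrors the abstract's argument for selecting a small set of important features before private training.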
One-sentence Summary: Larger models cause lower classification accuracy under differential privacy.
Supplementary Material: zip