Towards Efficient and Scalable Training of Differentially Private Deep Learning

Published: 18 Jun 2024, Last Modified: 25 Jun 2024 · WANT@ICML 2024 Poster · CC BY 4.0
Keywords: differential privacy, gradient based optimization, computational efficiency, distributed computing
TL;DR: We study the computational efficiency of the differentially private stochastic gradient descent algorithm, along with methods that reduce its cost and scale training up to many GPUs.
Abstract: Differentially private stochastic gradient descent (DP-SGD) is the standard algorithm for training machine learning models under differential privacy (DP). A major drawback of DP-SGD is the drop in utility, which prior work has studied comprehensively. In practice, however, another major drawback that hinders large-scale deployment is its significantly higher computational cost. We conduct a comprehensive empirical study to quantify the computational cost of training deep learning models under DP and benchmark methods that aim to reduce this cost, including more efficient implementations of DP-SGD and training with lower precision. Finally, we study the scaling behaviour using up to 80 GPUs.
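
For context, the extra cost of DP-SGD comes from clipping each example's gradient individually before adding calibrated Gaussian noise to the aggregate. The following is a minimal illustrative sketch of one DP-SGD step in plain PyTorch, not the paper's benchmarked implementation; the names `model`, `loss_fn`, `clip_norm`, `noise_multiplier`, and `lr` are assumed placeholders.

```python
# Illustrative sketch of a single DP-SGD step (assumed names, not the paper's code).
import torch

def dp_sgd_step(model, loss_fn, x, y, clip_norm, noise_multiplier, lr):
    params = [p for p in model.parameters() if p.requires_grad]
    summed_grads = [torch.zeros_like(p) for p in params]

    # Per-example gradients via size-1 microbatches; computing these
    # efficiently (e.g. vectorized per-example gradients) is the main
    # source of DP-SGD's computational overhead.
    for xi, yi in zip(x, y):
        model.zero_grad()
        loss = loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)

        # Clip each example's gradient to L2 norm at most clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed_grads, grads):
            s.add_(g * scale)

    # Add Gaussian noise calibrated to the clipping norm, average, and step.
    batch_size = x.shape[0]
    with torch.no_grad():
        for p, s in zip(params, summed_grads):
            noise = torch.normal(0.0, noise_multiplier * clip_norm,
                                 size=p.shape, device=p.device)
            p -= lr * (s + noise) / batch_size
```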
Submission Number: 20