FAST DIFFERENTIALLY PRIVATE-SGD VIA JL PROJECTIONS

Zhiqi Bu; Sivakanth Gopi; Janardhan Kulkarni; Yin Tat Lee; Uthaipon Tantipongpipat

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Withdrawn Submission, Readers: Everyone

Keywords: Deep Learning, Differential Privacy, Optimization Algorithms

Abstract: Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large-scale neural networks. This algorithm requires computing per-sample gradient norms, which is extremely slow and memory-intensive in practice. In this paper, we present a new framework for designing differentially private optimizers, which we call DP-SGD-JL and DP-Adam-JL. Our approach uses Johnson–Lindenstrauss (JL) projections to quickly approximate the per-sample gradient norms without computing them exactly, bringing the training time and memory requirements of our optimizers closer to those of their non-DP versions. Our algorithms achieve state-of-the-art privacy-vs-accuracy tradeoffs on the MNIST and CIFAR10 datasets while being significantly faster. Unlike previous attempts to make DP-SGD faster, which work only on fully-connected or convolutional layers, our algorithms work for any network in a black-box manner, which is the main contribution of this paper. To illustrate this, on the IMDb dataset we train a Recurrent Neural Network (RNN) that achieves a good privacy-vs-accuracy tradeoff, whereas existing DP optimizers are either inefficient or inapplicable. On RNNs, our algorithms are orders of magnitude faster than DP-SGD for large batch sizes. The privacy analysis of our algorithms is more involved than that of DP-SGD; we use the recently proposed f-DP framework of Dong et al. (2019). In summary, we design new differentially private training algorithms that are fast, achieve state-of-the-art privacy-vs-accuracy tradeoffs, and generalize to all network architectures.
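
The norm-approximation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the projections are computed, and the function name `jl_norm_estimates`, the dimensions, and the number of projections below are all hypothetical choices. The sketch relies only on the standard fact that for a Gaussian vector v with independent N(0, 1) entries, the expectation of (g · v)^2 equals ||g||^2, so averaging over several random directions estimates the norm. For clarity it materializes the per-sample gradients explicitly, which is exactly the cost a real implementation would try to avoid.

```python
import math
import random


def jl_norm_estimates(per_sample_grads, num_projections, rng):
    """Estimate the L2 norm of each gradient from random Gaussian projections.

    Hypothetical helper for illustration only. The same random directions
    are shared by every sample in the batch.
    """
    dim = len(per_sample_grads[0])
    # Draw num_projections directions with i.i.d. N(0, 1) entries.
    directions = [
        [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(num_projections)
    ]
    estimates = []
    for grad in per_sample_grads:
        # E[(g . v)^2] = ||g||^2, so the mean of squared dot products
        # is an unbiased estimate of the squared norm.
        mean_sq = sum(
            sum(g * x for g, x in zip(grad, v)) ** 2 for v in directions
        ) / num_projections
        estimates.append(math.sqrt(mean_sq))
    return estimates


rng = random.Random(0)
# Stand-in "per-sample gradients": 4 samples, 200 parameters (assumed sizes).
grads = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(4)]
exact = [math.sqrt(sum(g * g for g in row)) for row in grads]
approx = jl_norm_estimates(grads, num_projections=500, rng=rng)
rel_errors = [abs(a - e) / e for a, e in zip(approx, exact)]
print(rel_errors)
```

The relative error of each estimate shrinks roughly as one over the square root of the number of projections, so the projection count trades accuracy of the norm estimate against compute.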

One-sentence Summary: We design new private training algorithms for neural networks that are fast, achieve state-of-the-art privacy-vs-accuracy tradeoffs and generalize to all network architectures.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Reviewed Version (pdf): https://openreview.net/references/pdf?id=iQePV_fqXp
