Keywords: differential privacy, dp-sgd, gradient clipping, computational complexity
TL;DR: This paper presents a framework that unifies existing gradient clipping techniques and improves upon them.
Abstract: A well-known numerical bottleneck in the differentially-private stochastic gradient descent (DP-SGD) algorithm is the computation of the gradient norm for each example in a large input batch. When the loss function in DP-SGD involves an intermediate linear operation, existing methods in the literature have proposed decompositions of gradients that are amenable to fast norm computations. In this paper, we present a framework that generalizes this approach to arbitrary (possibly nonlinear) intermediate operations. Moreover, we show that for certain operations, such as fully-connected and embedding layer computations, further improvements to the runtime and storage costs of existing decompositions can be deduced using certain components of our framework. Finally, we give preliminary numerical experiments demonstrating the substantial effect of these improvements.
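For context on the fast-norm decomposition for intermediate linear operations referenced in the abstract, the sketch below (a minimal NumPy illustration, not the paper's framework; all variable names and dimensions are illustrative) shows the standard identity for a linear layer: the per-example weight gradient is an outer product, so its Frobenius norm factors into a product of two vector norms and can be computed without materializing per-example gradients.

```python
# Minimal sketch of the fast per-example gradient-norm computation for a
# linear layer (illustrative only; names and dimensions are assumptions).
import numpy as np

rng = np.random.default_rng(0)
B, d_in, d_out = 4, 8, 3          # batch size and layer dimensions (arbitrary)
a = rng.normal(size=(B, d_in))    # layer inputs (activations), one row per example
g = rng.normal(size=(B, d_out))   # gradients of the loss w.r.t. the layer outputs

# Naive route: materialize each per-example weight gradient g_i a_i^T,
# then take its Frobenius norm.
per_example_grads = np.einsum("bo,bi->boi", g, a)   # shape (B, d_out, d_in)
naive_norms = np.linalg.norm(per_example_grads.reshape(B, -1), axis=1)

# Fast route: the Frobenius norm of an outer product equals the product of
# the two vector norms, so no (B, d_out, d_in) tensor is ever formed.
fast_norms = np.linalg.norm(g, axis=1) * np.linalg.norm(a, axis=1)

assert np.allclose(naive_norms, fast_norms)
```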
Supplementary Material: pdf
Submission Number: 8533