Automatic Clipping: Differentially Private Deep Learning Made Easy and Stronger

16 May 2022 (modified: 03 Jul 2024) · NeurIPS 2022 Submission
Keywords: deep learning, differential privacy, per-sample gradient clipping, convergence
TL;DR: We propose automatic DP optimizers that remove the need to tune the clipping norm, with a convergence proof and state-of-the-art accuracy.
Abstract: Per-example gradient clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models. The choice of clipping norm $R$, however, has been shown to be vital for achieving high accuracy under DP. We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune $R$ for any DP optimizer, including DP-SGD, DP-Adam, DP-LAMB and many others. The automatic variants are as private and computationally efficient as existing DP optimizers, but require no DP-specific hyperparameters and thus make DP training as amenable to hyperparameter tuning as standard non-private training. We give a rigorous convergence analysis of automatic DP-SGD in the non-convex setting, which shows that it can enjoy an asymptotic convergence rate matching that of standard SGD, under a symmetric noise assumption on the per-sample gradients. We also demonstrate on various language and vision tasks that automatic clipping outperforms or matches the state-of-the-art, and can be easily employed with minimal changes to existing codebases.
Supplementary Material: pdf
Community Implementations: [7 code implementations](https://www.catalyzex.com/paper/automatic-clipping-differentially-private/code)
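
To make the abstract's idea concrete, here is a minimal NumPy sketch (not the authors' implementation) contrasting the standard per-sample clipping step of DP-SGD with an automatic-clipping-style step that normalizes each per-sample gradient by its norm plus a small stability constant, so no clipping norm $R$ has to be tuned. The function names, the `gamma` value, and the assumption that per-sample gradients are already materialized as a matrix are illustrative choices; the noise multiplier `sigma` would come from a privacy accountant as usual.

```python
import numpy as np

def clipped_dp_step(per_sample_grads, R, sigma, rng):
    """Standard DP-SGD step: clip each per-sample gradient to norm R,
    sum, add Gaussian noise with std sigma * R, and average."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    clipped = per_sample_grads * np.minimum(1.0, R / np.maximum(norms, 1e-12))
    noise = rng.normal(0.0, sigma * R, size=per_sample_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_sample_grads)

def automatic_dp_step(per_sample_grads, sigma, rng, gamma=0.01):
    """Automatic-clipping-style step (sketch): scale each per-sample
    gradient by 1 / (||g_i|| + gamma) instead of clipping at R; the
    effective R is absorbed into the learning rate, so there is no
    DP-specific clipping hyperparameter to tune."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    normalized = per_sample_grads / (norms + gamma)
    noise = rng.normal(0.0, sigma, size=per_sample_grads.shape[1])
    return (normalized.sum(axis=0) + noise) / len(per_sample_grads)

# Toy usage with random per-sample gradients (32 samples, 10 parameters).
rng = np.random.default_rng(0)
grads = rng.normal(size=(32, 10))
step_standard = clipped_dp_step(grads, R=1.0, sigma=1.0, rng=rng)
step_automatic = automatic_dp_step(grads, sigma=1.0, rng=rng)
```

Both steps bound each sample's contribution, so the same Gaussian-mechanism privacy accounting applies; the difference is that the automatic variant has no $R$ to sweep.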