Simple Baselines Are Strong Performers for Differentially Private Natural Language Processing

Xuechen Li; Florian Tramer; Percy Liang; Tatsunori Hashimoto

Published: 04 Nov 2021, Last Modified: 15 May 2023

Venue: PRIML 2021 Oral

Readers: Everyone

Keywords: differential privacy, language models, fine-tuning, NLP

TL;DR: We show that, with the right setup, fine-tuning language models with DP-Adam achieves strong performance on datasets of modest size.

Abstract: Differentially private learning has seen limited success for deep learning models of text, resulting in a perception that differential privacy may be incompatible with the language model fine-tuning paradigm. We demonstrate that this perception is inaccurate and that, with the right setup, high-performing private models can be learned on moderately sized corpora by directly fine-tuning with differentially private optimization. Our work highlights the important role of hyperparameters, task formulations, and pretrained models. Our analyses also show that the low performance of naive differentially private baselines in prior work is attributable to suboptimal choices in these factors. Empirical results reveal that differentially private optimization does not suffer from dimension-dependent performance degradation with pretrained models and achieves performance on par with state-of-the-art private training procedures and strong non-private baselines.

Paper Under Submission: The paper is NOT under submission at NeurIPS
