What You See is What You Get: Principled Deep Learning via Distributional Generalization

Bogdan Kulynych; Yao-Yuan Yang; Yaodong Yu; Jarosław Błasiok; Preetum Nakkiran

Published: 31 Oct 2022, Last Modified: 15 Jan 2023, NeurIPS 2022 Accept, Readers: Everyone

Keywords: deep learning, differential privacy, disparate impact, distributional robustness, DRO, adversarial robustness, robust overfitting, distributional generalization

Abstract: Having similar behavior at training time and test time—what we call a “What You See Is What You Get” (WYSIWYG) property—is desirable in machine learning. Models trained with standard stochastic gradient descent (SGD), however, do not necessarily have this property, as their complex behaviors such as robustness or subgroup performance can differ drastically between training and test time. In contrast, we show that Differentially-Private (DP) training provably ensures the high-level WYSIWYG property, which we quantify using a notion of distributional generalization. Applying this connection, we introduce new conceptual tools for designing deep-learning methods by reducing generalization concerns to optimization ones: to mitigate unwanted behavior at test time, it is provably sufficient to mitigate this behavior on the training data. By applying this novel design principle, which bypasses “pathologies” of SGD, we construct simple algorithms that are competitive with SOTA in several distributional-robustness applications, significantly improve the privacy vs. disparate impact trade-off of DP-SGD, and mitigate robust overfitting in adversarial training. Finally, we also improve on theoretical bounds relating DP, stability, and distributional generalization.

TL;DR: We develop the theoretical connection between differential privacy and distributional generalization, and we leverage our theory to improve empirical performance in privacy, fairness, and distributional-robustness applications.

Supplementary Material: zip
