Leveraging Per-Instance Privacy for Machine Unlearning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We show that per-instance privacy levels, computed during training, provide a practical and reliable way to predict unlearning difficulty in fine-tuning-based methods, enabling more efficient and targeted unlearning strategies.
Abstract: We present a principled, per-instance approach to quantifying the difficulty of unlearning via fine-tuning. We begin by sharpening an analysis of noisy gradient descent for unlearning (Chien et al., 2024), obtaining a better utility–unlearning trade-off by replacing worst-case privacy loss bounds with per-instance privacy losses (Thudi et al., 2024), each of which bounds the (Rényi) divergence to retraining without an individual datapoint. To demonstrate the practical applicability of our theory, we present empirical results showing that our theoretical predictions are borne out both for Stochastic Gradient Langevin Dynamics (SGLD) and for standard fine-tuning without explicit noise. We further demonstrate that per-instance privacy losses correlate well with several existing data difficulty metrics, while also identifying harder groups of data points, and we introduce novel evaluation methods based on loss barriers. Altogether, our findings provide a foundation for more efficient and adaptive unlearning strategies tailored to the unique properties of individual data points.
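
The following is a minimal, hypothetical sketch (not the authors' code) of the idea described above: accumulate a per-instance privacy loss during noisy training and use it to rank datapoints by predicted unlearning difficulty. It assumes a toy logistic-regression model, Gaussian-mechanism-style per-step Rényi losses of the form α‖δ_i‖²/(2σ²) with additive composition over steps, and simple full-batch SGLD-style updates; all variable names, constants, and the noise scaling are illustrative assumptions rather than the paper's exact method.

```python
# Hypothetical sketch, not the authors' implementation. Assumes the per-step
# per-instance Renyi loss for a noisy (SGLD-style) update takes the
# Gaussian-mechanism form alpha * ||delta_i||^2 / (2 * sigma^2), where delta_i
# is how much datapoint i shifts the mean update, composed additively.
import numpy as np

rng = np.random.default_rng(0)

# Toy logistic-regression data.
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)

def per_example_grads(w, X, y):
    # Gradient of the logistic loss for each datapoint; shape (n, d).
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (p - y)[:, None] * X

# Noisy full-batch training while accumulating per-instance privacy losses.
eta, sigma, alpha, steps = 0.1, 0.05, 2.0, 200
w = np.zeros(d)
per_instance_loss = np.zeros(n)  # accumulated Renyi loss per datapoint

for _ in range(steps):
    g = per_example_grads(w, X, y)             # (n, d)
    mean_grad = g.mean(axis=0)
    # Removing point i changes the mean update by roughly eta * g_i / n.
    delta = eta * g / n                        # per-instance update shift
    per_instance_loss += alpha * np.sum(delta**2, axis=1) / (2 * sigma**2)
    noise = sigma * rng.normal(size=d)
    w -= eta * mean_grad + noise               # noisy gradient step

# Larger accumulated loss ~ predicted harder to unlearn via fine-tuning.
hardest = np.argsort(per_instance_loss)[::-1][:10]
print("Predicted hardest-to-unlearn indices:", hardest)
```

In this sketch the ranking induced by `per_instance_loss` plays the role of the paper's difficulty metric: points whose removal would shift the noisy updates the most accumulate larger bounds and are predicted to need more fine-tuning steps to forget.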
Lay Summary: In scenarios such as complying with legislation or removing corrupted training data, a model trainer is required to "forget" part of their training dataset. We show that a metric derived from statistics collected during training can predict how hard it will be to forget a datapoint. Theoretically, we prove that this metric provides an upper bound on how many steps of gradient descent are required to forget a datapoint. Empirically, we find that across training setups this metric accurately ranks datapoints by how many gradient descent steps they require to be forgotten. Moreover, we find that our proposed metric discovers harder-to-forget datapoints compared to past approaches for identifying difficult data points.
Primary Area: Deep Learning
Keywords: machine unlearning, differential privacy
Submission Number: 14447