Keywords: machine unlearning theory, overparameterization
Abstract: We study the unlearning problem in the overparameterized regime, where many models interpolate the data. In this setting, defining the unlearning solution as any loss minimizer over the retained data—as in prior work in the underparameterized case—is inadequate, since the original model may already interpolate the retained data and thus trivially satisfy this condition. Further, loss gradients vanish at interpolating solutions, rendering prior methods based on loss gradient perturbations ineffective and motivating new unlearning definitions and algorithms. We define the unlearning solution as the minimum-complexity interpolator of the retained data, and propose a framework that recovers this solution by minimizing a regularized objective under a relaxed interpolation constraint, which requires the perturbation of the original model to be orthogonal to the model's gradients on the retained data. For different model classes, we provide exact and approximate unlearning guarantees, and we show that an implementation of our framework outperforms existing baselines across unlearning experiments.
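To make the abstract's framework concrete, here is a minimal sketch in the simplest overparameterized setting: a linear model f(x) = w·x with more parameters than samples, where minimum complexity reduces to minimum norm. All variable names and the setup below are illustrative assumptions for this sketch, not artifacts from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 8, 20, 5          # n samples, d parameters (d > n), k retained samples
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d)

# "Original" model: an interpolator of the full data (here, the min-norm one).
w0 = np.linalg.pinv(X) @ y
X_r, y_r = X[:k], y[:k]      # retained data; w0 already interpolates it

# Unlearning target: the minimum-norm interpolator of the retained data.
w_star = np.linalg.pinv(X_r) @ y_r

# Framework sketch: minimize ||w||^2 subject to the perturbation w - w0
# being orthogonal to the model gradients on the retained data (for a
# linear model, those gradients are just the rows of X_r, so the
# perturbation must lie in the null space of X_r). The constrained
# minimizer is the projection of w0 onto the row space of X_r:
w_unlearned = np.linalg.pinv(X_r) @ (X_r @ w0)

print(np.allclose(w_unlearned, w_star))  # the projection recovers w_star
```

In this linear special case the orthogonality constraint preserves the predictions on the retained data exactly, and minimizing the norm within that constraint set recovers the minimum-norm retained-data interpolator in closed form; for richer model classes the paper's guarantees are what justify the analogous relaxation.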
Submission Number: 35