Are We Really Unlearning? The Presence of Residual Knowledge in Machine Unlearning

Published: 05 Mar 2025, Last Modified: 18 Mar 2025 · ICLR 2025 Workshop ICBINB · CC BY 4.0
Track: long paper (up to 4 pages)
Keywords: machine unlearning, residual knowledge, adversarial attacks
Abstract:

Machine unlearning seeks to remove a set of forget samples from a pre-trained model to comply with emerging privacy regulations. While existing machine unlearning algorithms focus on effectiveness by either achieving indistinguishability from a re-trained model or closely matching its accuracy, they often overlook the vulnerability of unlearned models to slight perturbations of forget samples. In this paper, we identify a novel privacy vulnerability in unlearning, which we term residual knowledge. We find that even when an unlearned model no longer recognizes a forget sample (effectively removing direct knowledge of the sample), residual knowledge often persists in the sample's vicinity: the unlearned model continues to recognize slightly perturbed variants of the sample that a re-trained model does not recognize at all. Addressing residual knowledge should become a key consideration in the design of future unlearning algorithms.
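To make the notion of residual knowledge concrete, the sketch below shows one plausible way to probe the vicinity of a forget sample: apply small random perturbations and measure how often the model still predicts the forgotten label. This is a minimal illustration in PyTorch, not the paper's method; the function name `residual_knowledge_rate`, the uniform eps-ball perturbation, and the assumption of inputs normalized to [0, 1] are all hypothetical choices for the sketch.

```python
import torch

def residual_knowledge_rate(model, x, y, eps=0.05, n_probes=100):
    """Fraction of small random perturbations of a forget sample x
    that the model still classifies as the original label y.

    Intuition (hedged): a re-trained model that never saw x should
    rarely recover label y in this neighborhood, so a markedly higher
    rate for the unlearned model would suggest residual knowledge.
    """
    model.eval()
    hits = 0
    with torch.no_grad():
        for _ in range(n_probes):
            # Sample a perturbation uniformly from the eps-ball around x.
            delta = torch.empty_like(x).uniform_(-eps, eps)
            # Assumes inputs are normalized to [0, 1]; adjust for other ranges.
            probe = (x + delta).clamp(0.0, 1.0)
            pred = model(probe.unsqueeze(0)).argmax(dim=1).item()
            hits += int(pred == y)
    return hits / n_probes
```

In use, one would compare `residual_knowledge_rate(unlearned_model, x, y)` against `residual_knowledge_rate(retrained_model, x, y)` for each forget sample; a gap between the two rates is the kind of vicinity-level signal the abstract describes, whereas matching rates near zero would indicate the neighborhood was forgotten as well.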

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 18