Towards Safe Machine Unlearning: a Paradigm that Mitigates Performance Degradation

Published: 29 Jan 2025 · Last Modified: 29 Jan 2025 · WWW 2025 Poster · CC BY 4.0
Track: User modeling, personalization and recommendation
Keywords: Privacy protection, machine unlearning, trustworthy machine learning
Abstract: We study the machine unlearning problem, which aims to remove specific training data from a pre-trained machine learning model so that users can exercise their 'right to be forgotten' and protect their privacy. Conventional machine unlearning methods tend to degrade model performance after the unlearning procedure. To mitigate this issue, they typically rely on access to the remaining training data to fine-tune the unlearned model and offset the influence of unlearning. However, accessing the remaining training data is not always practical (e.g., due to data expiration policies, storage limitations, or additional privacy constraints). Machine unlearning without access to the remaining training data poses significant challenges for retaining model performance. In this paper, we study how to unlearn specific training data from a pre-trained model without accessing the remaining training data, while preserving model performance by avoiding drastic changes to the model's parameters. We propose a practical method called Targeted Label Noise Injection. Intuitively, our method assigns incorrect yet controllable labels to the examples that need to be forgotten and fine-tunes the pre-trained model to learn these new labels. This strategy effectively moves the to-be-forgotten examples across the decision boundary with only a small impact on the model's overall performance. We theoretically prove the effectiveness of the proposed method and empirically show that it achieves state-of-the-art unlearning performance across various datasets.
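To make the intuition concrete, below is a minimal PyTorch-style sketch of the idea as described in the abstract: relabel the to-be-forgotten examples with incorrect yet controlled target classes and fine-tune the pre-trained model on those relabeled examples only. The function name, the choice of the runner-up class as the "controllable" wrong label, and all hyperparameters are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F


def targeted_label_noise_unlearn(model, forget_loader, epochs=3, lr=1e-4):
    """Sketch: fine-tune `model` so forget-set examples cross the decision boundary."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    # Step 1: assign each forget example an incorrect but controlled label.
    # Here we use the model's runner-up prediction (an assumption), which keeps
    # the example near its current decision boundary.
    relabeled = []
    model.eval()
    with torch.no_grad():
        for x, y in forget_loader:
            logits = model(x)
            logits.scatter_(1, y.unsqueeze(1), float("-inf"))  # mask the true class
            y_noise = logits.argmax(dim=1)                      # runner-up class
            relabeled.append((x, y_noise))

    # Step 2: fine-tune only on the relabeled forget examples; no access to the
    # remaining training data is required.
    model.train()
    for _ in range(epochs):
        for x, y_noise in relabeled:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x), y_noise)
            loss.backward()
            optimizer.step()
    return model
```

In this sketch, choosing the runner-up class (rather than a random wrong label) is one plausible way to keep the parameter update small, consistent with the abstract's goal of not dramatically changing the model's parameters; the paper's precise label-selection rule may differ.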
Submission Number: 1669