Few-shot Unlearning

Published: 01 Jan 2024, Last Modified: 13 May 2025, SP 2024, CC BY-SA 4.0
Abstract: We consider the problem of machine unlearning: erasing from a trained model the impact of a target dataset that was used in training but is incorrect or sensitive. It has often been presumed that every data sample to erase or retain is fully identifiable, and that this identification clarifies the desired model behavior after unlearning. However, such flawless identification can be infeasible in practice. We pose a more realistic yet challenging scenario, referred to as few-shot unlearning, in which only a few samples of the target data are provided, while the goal is still to achieve the underlying intention behind the full target dataset (e.g., correcting mislabels, countering a certain privacy attack, or specifying nothing). We then devise a few-shot unlearning method that includes a new model inversion technique, specialized for unlearning scenarios, to retrieve a proxy of the training dataset from the trained model if needed. We demonstrate that our method, using only a tiny subset of the target data, can achieve performance similar to that of state-of-the-art methods with full access to the target data. Our code and results are available at https://github.com/ml-postech/Few-shot-Unlearning.