Unlearning Tabular Data Without a "Forget Set"

Aviraj Newatia; Michael Cooper; Rahul Krishnan

Published: 10 Oct 2024, Last Modified: 31 Oct 2024 | TRL @ NeurIPS 2024 Poster | CC BY 4.0

Keywords: machine unlearning, tabular representation learning, tabnet, tabular learning, attention, feature unlearning, tabular data

TL;DR: Efficient machine unlearning on tabular data without requiring access to a forget set.

Abstract: Machine unlearning is the process of removing the influence of some subset of the training data from the parameters of a previously trained model. Existing methods typically require direct access to the "forget set", the subset of training data to be forgotten by the model. This limitation impedes privacy: organizations must retain user data in order to unlearn it when a deletion request is made, rather than being able to delete it immediately. We introduce RELOAD, an approximate unlearning algorithm that leverages ideas from gradient-based unlearning and neural network sparsity to achieve blind unlearning on tabular data. The method serially applies an ascent step, targeted parameter re-initialization, and fine-tuning. On empirical unlearning tasks, RELOAD often approximates the behaviour of a from-scratch retrained model better than approaches that leverage the forget set. These results highlight RELOAD's potential to improve privacy-preserving machine learning in the tabular setting.
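The three stages named in the abstract (ascent step, targeted re-initialization, fine-tuning) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not say how the ascent direction is obtained without the forget set, nor how parameters are chosen for re-initialization, so the sketch makes two loudly labeled assumptions: (1) the gradient of the full training set was cached at training time, so the forget-set gradient can be inferred by subtracting the retained-data gradient, and (2) the weights with the largest inferred forget-gradient magnitude are the ones reset. The model is a plain linear regressor, and all names (`grad_sum`, `blind_unlearn_sketch`, `reinit_frac`, and so on) are hypothetical.

```python
import numpy as np

def grad_sum(w, X, y):
    """Gradient of the summed squared-error loss 0.5 * ||Xw - y||^2."""
    return X.T @ (X @ w - y)

def blind_unlearn_sketch(w, cached_full_grad, X_keep, y_keep,
                         ascent_lr=0.01, reinit_frac=0.25,
                         finetune_lr=0.1, finetune_steps=200, seed=0):
    """Toy ascent -> re-init -> fine-tune loop; never sees the forget set."""
    rng = np.random.default_rng(seed)
    w = w.copy()
    # ASSUMPTION: summed-loss gradients are additive over examples, so
    # forget-set gradient = cached full gradient - retained-data gradient.
    forget_grad = cached_full_grad - grad_sum(w, X_keep, y_keep)
    # 1) Ascent step: move uphill on the (inferred) forget-set loss.
    w += ascent_lr * forget_grad
    # 2) Targeted re-initialization. ASSUMPTION: reset the weights with the
    #    largest inferred forget-gradient magnitude (hypothetical criterion).
    k = max(1, int(round(reinit_frac * w.size)))
    reset_idx = np.argsort(-np.abs(forget_grad))[:k]
    w[reset_idx] = rng.normal(scale=0.01, size=k)
    # 3) Fine-tune on the retained data only (mean-loss gradient descent).
    for _ in range(finetune_steps):
        w -= finetune_lr * grad_sum(w, X_keep, y_keep) / len(y_keep)
    return w, reset_idx

# Synthetic tabular data: 200 rows, 4 features.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * data_rng.normal(size=200)

w_trained = np.linalg.lstsq(X, y, rcond=None)[0]
cached = grad_sum(w_trained, X, y)   # stored at training time

# The first 50 rows are deleted; only the retained rows are passed on.
X_keep, y_keep = X[50:], y[50:]
w_new, reset_idx = blind_unlearn_sketch(w_trained, cached, X_keep, y_keep)

# Reference point: a model retrained from scratch on the retained rows.
w_retrain = np.linalg.lstsq(X_keep, y_keep, rcond=None)[0]
print(np.abs(w_new - w_retrain).max())
```

Because this toy problem is convex, fine-tuning alone would already reach the retrained solution, so the example only shows the control flow and the fact that the deleted rows are never touched after training. It says nothing about how the method behaves on neural networks.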

Submission Number: 55
