Keywords: merging, localization, task arithmetic, unlearning, masking
TL;DR: We propose an efficient framework for machine unlearning based on model merging and localization.
Abstract: Approximate unlearning has gained popularity as an approach to efficiently update a model so it (roughly) behaves as if it were not trained on a subset of data. However, approximate unlearning methods have been shown to be quite brittle in practice. In fact, such approaches can easily be attacked to reveal supposedly unlearned information. To address this issue, we instead propose a *model merging* approach, **ClAMU**, which produces combined models that support *efficient and exact* deletion of data slated for unlearning. In addition to leveraging techniques from model merging and localization, **ClAMU** relies on two key innovations. First, we cluster tasks together and serve per-cluster models, balancing the utility of local models against the storage cost of a global model. Second, unlike existing localization methods that compress local models into masks, we directly optimize local (or cluster-level) masks, which greatly improves utility. Relative to model merging and localization baselines, **ClAMU** serves models with up to 20% improved accuracy while reducing storage costs by up to 75%.
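A minimal sketch of the pipeline the abstract describes, assuming Python/NumPy. All names (`task_vectors`, `cluster_tasks`, `merge_cluster`, `sparse_mask`, `delete_task`) are illustrative rather than the authors' actual API, and the magnitude-based mask is a simple stand-in for the paper's optimized masks. The key property shown is that a summed (task-arithmetic) merge makes per-task deletion *exact*: removing a task is just subtracting its task vector.

```python
# Toy illustration of merging + clustering + masking + exact deletion.
# This is a sketch under stated assumptions, not ClAMU's implementation.
import numpy as np

rng = np.random.default_rng(0)

def task_vectors(base, finetuned_models):
    """Task arithmetic: represent each task as (theta_i - theta_base)."""
    return [theta - base for theta in finetuned_models]

def cluster_tasks(taus, k, iters=10):
    """Toy k-means over flattened task vectors (any clustering would do)."""
    X = np.stack(taus)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(0)
    return labels

def merge_cluster(base, taus, members):
    """Per-cluster merged model: base plus the sum of member task vectors.
    Because the merge is a plain sum, unlearning a member task exactly
    amounts to subtracting its task vector."""
    return base + sum(taus[i] for i in members)

def sparse_mask(tau, density=0.1):
    """Stand-in for an optimized mask: keep the top-|density| weights."""
    k = max(1, int(density * tau.size))
    thresh = np.partition(np.abs(tau), -k)[-k]
    return (np.abs(tau) >= thresh).astype(tau.dtype)

def delete_task(cluster_model, taus, j):
    """Exact deletion: remove task j's contribution from its cluster model."""
    return cluster_model - taus[j]

# Demo on synthetic flattened weights.
d, n_tasks, k = 64, 6, 2
base = rng.normal(size=d)
models = [base + 0.1 * rng.normal(size=d) for _ in range(n_tasks)]
taus = task_vectors(base, models)
labels = cluster_tasks(taus, k)
clusters = {c: np.flatnonzero(labels == c).tolist() for c in range(k)}
merged = {c: merge_cluster(base, taus, m) for c, m in clusters.items()}

# Serve task 0 from its cluster model via a (toy) mask over the merged delta.
c0 = labels[0]
served_0 = base + sparse_mask(taus[0]) * (merged[c0] - base)

# Exact unlearning of task 0: subtract its task vector from the cluster model.
merged[c0] = delete_task(merged[c0], taus, 0)
```

Storing one merged model plus a sparse mask per task (or per cluster) is what trades off local-model utility against global-model storage in the abstract's framing.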
Submission Number: 36