MaGA: Machine-Guided Amnesiac Unlearning through Target Feature Disentanglement

Haoyu Wang; Zhuo Huang; Xiaolong Wang; Bo Han; Zhiwei Lin; Tongliang Liu

17 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0

Keywords: Machine Unlearning, Multimodal Learning, Large Language Models

Abstract: Concerns over the security of training data have given rise to the "Right to be Forgotten" policy, which protects the privacy of data providers and creates an urgent need for effective Machine Unlearning. However, existing unlearning methods often face a trade-off between fully erasing the influence of target data and preserving overall model capability. To address this, we first investigate the intrinsic characteristics of class concepts learned during model pretraining, revealing that these concepts are often entangled at the feature pattern level. Building on this insight, we introduce Machine-Guided Amnesiac (MaGA), a novel unlearning framework that guides the unlearning process by leveraging Multi-modal Large Language Models to estimate conceptual similarities between features. These similarities are encoded in a transition matrix that assigns suitable perturbing labels, and the target data are re-aligned to these labels to achieve unlearning. This facilitates effective unlearning because it perturbs the concepts related to the target instances, thereby reducing undesired model disruption. Furthermore, we propose a Fragment-Absorb strategy to disentangle the influence of target concepts through a positive-negative feature noise pair. During unlearning, both feature noises are leveraged to impede target feature patterns while enhancing the remaining desired features. This promotes selective forgetting of target data influence, facilitating complete unlearning while mitigating the risks of under-unlearning or over-unlearning. Extensive experiments conducted across typical unlearning tasks and diverse datasets demonstrate that our approach outperforms existing baselines, effectively removing target data while preserving the model's generalization on retained data.
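
The idea of encoding conceptual similarities in a transition matrix and using it to assign perturbing labels can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `build_transition_matrix` and `assign_perturbing_labels`, the softmax normalization with a `temperature` parameter, the zeroed diagonal, the sampling of labels (rather than, say, taking the most similar class), and the toy similarity values, which in the described framework would come from a Multi-modal Large Language Model.

```python
import numpy as np


def build_transition_matrix(similarity, temperature=1.0):
    """Convert a class-by-class similarity matrix into a row-stochastic
    transition matrix with a zero diagonal (hypothetical construction)."""
    logits = np.asarray(similarity, dtype=float) / temperature
    # Assumption: a target sample is never re-assigned its original label.
    np.fill_diagonal(logits, -np.inf)
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def assign_perturbing_labels(labels, transition, rng):
    """Sample one perturbing label per target sample from the row of the
    transition matrix indexed by the sample's original label."""
    labels = np.asarray(labels)
    num_classes = transition.shape[0]
    return np.array([rng.choice(num_classes, p=transition[y]) for y in labels])


# Toy similarity scores between three classes (illustrative values only).
similarity = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])
transition = build_transition_matrix(similarity, temperature=0.5)
rng = np.random.default_rng(0)
perturbed = assign_perturbing_labels([0, 0, 1, 2], transition, rng)
```

Under these assumptions, each row of `transition` puts more probability on classes that are conceptually closer to the original class, so the perturbing labels stay near the target concept instead of being drawn uniformly at random.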

Supplementary Material: zip

Primary Area: alignment, fairness, safety, privacy, and societal considerations

Submission Number: 8241
