CURE: A Unified Framework for Class and Concept Unlearning via Retraining Emulation

20 Sept 2025 (modified: 12 Feb 2026) · ICLR 2026 Conference Desk Rejected Submission · CC BY 4.0
Keywords: Machine Unlearning, Knowledge Distillation, Deep Learning, Classification, Large Language Models
TL;DR: We propose an unlearning framework for classification models and large language models that orthogonally combines gradients from two distillation-based objectives.
Abstract: Driven by evolving data regulations and the need for trustworthy AI, machine unlearning (MU) addresses the critical challenge of efficiently removing undesirable knowledge from models without the prohibitive cost of retraining. However, existing MU methods struggle to balance complete removal of the target information against preserving performance on the remaining data. In datasets with rich concept hierarchies, an additional trade-off arises between retaining knowledge of closely related concepts and retaining knowledge of more general, unrelated ones. We propose CURE (Class and Concept Unlearning via Retraining Emulation), a framework that preserves model performance by emulating retraining through knowledge distillation. We first formulate two unlearning strategies with mathematical justification: Guided Hard Relabeling (GHR) with cross-entropy and Guided Soft Relabeling (GSR) with Kullback-Leibler (KL) divergence. In datasets with extensive semantic hierarchies, we observe a key trade-off: GHR offers concentrated preservation for a small subset of closely related retain concepts, while GSR is more effective at preserving the wider set of dissimilar retain concepts. To unify these benefits, we introduce Guided Restricted Orthogonal Gradient Unlearning (GROGU), which optimizes the update step by orthogonally combining gradients from the two distillation-based objectives. Experiments on the image classification benchmarks CIFAR-10, CIFAR-100, and ImageNet, as well as on large language models (LLMs), show that our methods achieve superior target erasure while preserving accuracy on retained data, outperforming existing techniques.
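The abstract does not give GROGU's exact update rule, but the idea of "orthogonally combining gradients from two objectives" can be illustrated with a generic gradient-projection sketch: remove from the secondary objective's gradient its component along the primary objective's gradient before summing, so the secondary update cannot counteract the primary one. All names below are hypothetical; this is not the paper's implementation.

```python
def orthogonal_combine(g_primary, g_secondary, eps=1e-12):
    """Generic orthogonal gradient combination (illustrative sketch only).

    g_primary:   gradient of the primary objective (e.g. a GHR-style
                 cross-entropy distillation loss), as a flat list of floats.
    g_secondary: gradient of the secondary objective (e.g. a GSR-style
                 KL distillation loss), same length.

    Returns g_primary plus the component of g_secondary that is
    orthogonal to g_primary.
    """
    dot = sum(p * s for p, s in zip(g_primary, g_secondary))
    norm_sq = sum(p * p for p in g_primary)
    coef = dot / (norm_sq + eps)
    # Subtract the projection of g_secondary onto g_primary, so the
    # remaining secondary component is orthogonal to the primary direction.
    return [p + (s - coef * p) for p, s in zip(g_primary, g_secondary)]
```

For example, with `g_primary = [2, 0]` and `g_secondary = [3, 4]`, the projection strips the secondary gradient's `[3, 0]` component and the combined update is approximately `[2, 4]`.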
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 22197