Exclusive Unlearning: Forgetting All Except What You Need

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: knowledge unlearning, large language models
TL;DR: Unlike conventional unlearning, we preserve only the dataset-specified ability and forget all others.
Abstract: Large language models (LLMs) acquire diverse knowledge and abilities through pretraining, but broad and uncontrolled capabilities are not always desirable. Conventional unlearning aims to erase specific knowledge while preserving general fluency and reasoning. However, when the target task is clearly defined, we can remove all abilities that are not required to perform that task, rather than attempting to control a wide range of undesirable behaviors. For example, customer-care chatbots should only answer anticipated questions, and in education, a subject-specific model may be preferable to prevent unintended use. In this paper, we take the opposite perspective from conventional unlearning and propose a method that preserves only the ability specified by a dataset while forgetting all other knowledge and abilities. Our approach is remarkably simple: we train the model on the target task via standard fine-tuning while simultaneously forcing the probability distribution over the model's generated texts to become uniform. This ensures that the model retains the capability required for the target task, while forgetting all other abilities. We demonstrate that our method successfully retains specific abilities (extractive QA and mathematical QA) while forgetting all other knowledge and abilities. Furthermore, we show that our method more effectively removes all abilities except the designated one compared to a standard unlearning approach.
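The objective described in the abstract, standard fine-tuning on the target task plus a term pushing the output distribution toward uniform elsewhere, can be sketched as a combined loss. The function below is a minimal illustration, not the paper's implementation: the split into task-batch and other-batch logits, and the balancing weight `lam`, are assumptions for exposition.

```python
import torch
import torch.nn.functional as F

def exclusive_unlearning_loss(logits_task, labels_task, logits_other, lam=1.0):
    """Sketch of the combined objective: retain the target task, flatten the rest.

    logits_task : (B, T, V) logits on target-task data -> cross-entropy term.
    logits_other: (B, T, V) logits on non-target text  -> KL to the uniform
                  distribution over the vocabulary, making predictions there
                  uninformative.
    lam is a hypothetical balancing weight (not specified in the abstract).
    """
    vocab = logits_task.size(-1)

    # Standard fine-tuning loss on the ability to preserve.
    ce = F.cross_entropy(logits_task.view(-1, vocab), labels_task.view(-1))

    # KL(p || uniform) = log(V) - H(p); it is zero exactly when p is uniform,
    # so minimizing it drives the model's distribution toward uniform.
    log_probs = F.log_softmax(logits_other, dim=-1)
    uniform_logp = -torch.log(torch.tensor(float(vocab)))
    kl_to_uniform = (log_probs.exp() * (log_probs - uniform_logp)).sum(-1).mean()

    return ce + lam * kl_to_uniform
```

In practice the two terms would be computed on separate batches (target-task examples vs. general text), with `lam` trading off retention against forgetting.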
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 23176