Effective Unlearning in LLMs Relies on the Right Data Retention Strategy

ICLR 2026 Conference Submission 18548 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: LLM Unlearning, data perspective, retain set selection
TL;DR: We focus on retain-set selection for LLM unlearning in realistic scenarios, and investigate how this selection affects forget quality and model utility through hidden-state representations.
Abstract: Unlearning in Large Language Models (LLMs) has gained increasing attention in recent years due to its critical role in ensuring ethical and legal compliance. Although significant progress has been made in developing unlearning algorithms, relatively little attention has been devoted to the data perspective. In particular, the role of retain-set selection in preserving model utility remains underexplored, even though it is critical for making unlearning practical in real-world applications. In this work, we explore strategies for constructing effective retain sets by adapting methods from coreset selection and prior unlearning research. We evaluate these approaches on two complementary datasets: (i) a monotonic dataset derived from an existing benchmark, and (ii) a mixed, larger-scale dataset combining WPU, TOFU, and Dolly, which better reflects realistic scenarios where forget and retain samples are not explicitly defined. We find that both model utility and forget quality are strongly influenced by the variance of the model's representations within the selected retain set. Moreover, we show that simply choosing data samples with high semantic or syntactic similarity to the forget set can yield substantially better results than standard coreset techniques. To the best of our knowledge, this work represents the first systematic study of retain-set selection for LLM unlearning, highlighting both its importance and the challenges it poses in practical settings.
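
As a rough illustration of the similarity-based selection mentioned in the abstract, the sketch below scores candidate retain samples by cosine similarity between their representations and the forget-set centroid, keeping the top-k most similar as the retain set. The function name, the centroid heuristic, and the use of precomputed embeddings are assumptions for illustration only, not the paper's exact procedure.

    # Hypothetical sketch: build a retain set from candidates most similar to the forget set.
    # Assumes embeddings (e.g., mean-pooled hidden states of the model being unlearned)
    # are precomputed; the top-k centroid heuristic is illustrative, not the paper's method.
    import numpy as np

    def select_retain_set(forget_emb: np.ndarray,
                          candidate_emb: np.ndarray,
                          k: int) -> np.ndarray:
        """Return indices of the k candidates closest to the forget-set centroid."""
        # L2-normalise so dot products become cosine similarities.
        forget = forget_emb / np.linalg.norm(forget_emb, axis=1, keepdims=True)
        cand = candidate_emb / np.linalg.norm(candidate_emb, axis=1, keepdims=True)

        # Score each candidate against the mean forget representation.
        centroid = forget.mean(axis=0)
        scores = cand @ centroid

        # Keep the k highest-scoring candidates as the retain set.
        return np.argsort(scores)[::-1][:k]

    # Example with random stand-in embeddings.
    rng = np.random.default_rng(0)
    retain_idx = select_retain_set(rng.normal(size=(100, 768)),
                                   rng.normal(size=(5000, 768)), k=512)

A coreset baseline would instead pick candidates that cover the candidate pool itself (e.g., by clustering), independent of the forget set; the abstract's finding is that forget-set-aware similarity selection can outperform such techniques.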
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 18548