Provable Continual Unlearning for Large Language Models

ICLR 2026 Conference Submission 420 Authors

01 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: ML
Abstract: Continual unlearning in large language models (LLMs) requires forgetting targeted domains while preserving utility elsewhere as requests arrive sequentially. Existing approaches are largely heuristic and accumulate interference over time. We present a principled \emph{optimization} framework, SOCPE (\textbf{S}pectral \textbf{O}rthogonality for \textbf{C}ontinual unlearning with \textbf{P}rovable guarant\textbf{E}es), that formalizes continual unlearning via three explicit conditions: \emph{selective forgetting}, \emph{utility preservation}, and \emph{persistence}, and satisfies them by parameterizing updates in an orthonormal spectral basis with disjoint coefficient supports. This construction enforces orthogonality by design, yields capacity laws that bound interference as requests accumulate, and admits an efficient FFT-based instantiation that needs no basis storage and scales as $O(d\log d)$. The same parameterization provides an inference-time routing signal via spectral activations, enabling calibrated triggering of unlearning adapters. Across discriminative, generative, and reasoning benchmarks, and without using retained data from unaffected domains, our method delivers stronger unlearning–utility trade-offs and more stable scaling than competitive baselines, offering a scalable framework with explicit guarantees for continual unlearning in LLMs.
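To make the spectral construction concrete, here is a minimal NumPy sketch under assumptions not fixed by the abstract: the parameter dimension `d`, the per-request frequency-band layout, and the helper `spectral_update` are all illustrative, not the paper's actual instantiation. Each unlearning request edits parameters only through Fourier coefficients in its own disjoint band, so updates from different requests are orthogonal by Parseval's theorem, the inverse FFT realizes the update in $O(d\log d)$ with no stored basis, and per-band spectral energy can serve as a routing signal.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 1024                 # parameter dimension (illustrative)
T = 4                    # number of sequential unlearning requests
n_bins = d // 2 + 1      # rfft coefficient count for a real signal
band = n_bins // T       # disjoint frequency band per request (hypothetical layout)

def spectral_update(t, scale=1e-2):
    """Parameter update for request t, supported on its own frequency band.

    Disjoint supports in the Fourier basis make updates from different
    requests exactly orthogonal, bounding cross-request interference.
    """
    coeffs = np.zeros(n_bins, dtype=complex)
    lo, hi = t * band, (t + 1) * band
    coeffs[lo:hi] = scale * (rng.standard_normal(hi - lo)
                             + 1j * rng.standard_normal(hi - lo))
    # The inverse real FFT maps coefficients back to parameter space in
    # O(d log d) time; no explicit basis matrix is ever stored.
    return np.fft.irfft(coeffs, n=d)

updates = [spectral_update(t) for t in range(T)]

# Orthogonality by design: cross inner products vanish (up to float error).
gram = np.array([[u @ v for v in updates] for u in updates])
assert np.allclose(gram - np.diag(np.diag(gram)), 0.0, atol=1e-12)

# Inference-time routing signal: the spectral energy of an activation x
# inside each request's band suggests which unlearning adapter to trigger.
x = rng.standard_normal(d)
X = np.fft.rfft(x)
band_energy = [float(np.sum(np.abs(X[t * band:(t + 1) * band]) ** 2))
               for t in range(T)]
```

In this toy layout, $T$ requests partition roughly $d/2$ real-FFT coefficients, so each request receives about $d/(2T)$ of the spectrum; a budget of this form is the kind of capacity law the abstract alludes to, though the paper's precise bounds may differ.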
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 420