Learning-Forgetting Optimality in Supervised Finetuning: A Cliff Perspective

Published: 29 May 2026, Last Modified: 08 Jun 2026 | Venue: HiLD at ICML 2026 Poster | License: CC BY 4.0
Keywords: finetuning, catastrophic forgetting, post-training quantization, weight-space symmetries
TL;DR: We propose a function-preserving flattening of the loss landscape, and show that SFT either exhibits a gradual learning–forgetting trade-off (where SIB and continual learning methods help) or a sharp "cliff" (where they do not).
Abstract: Supervised finetuning (SFT) of pretrained language models trades off acquiring new domain capabilities against retaining prior knowledge. Increasingly, degradation from post-training quantization (PTQ) and forgetting from SFT are both seen as loss-geometry problems, where flatness leads to lower degradation. In this work, we adopt a unified view of post-training perturbations. In particular, inspired by PTQ, we propose Scale Invariant Balancing (SIB), a functionally equivalent reparameterization of the model that flattens the loss landscape within the model's weight-space symmetries. Moreover, we extensively characterize the learning-forgetting trade-off of plain SFT, SIB, and various classical continual learning methods, and find that two regimes universally arise across models and methods. In the first, baseline SFT exhibits a gradual trade-off between learning and forgetting, and SIB can be applied to improve Pareto optimality. In the second, SFT trajectories develop a "cliff": a sharp phase transition where training recipes flip from learning without forgetting to catastrophic forgetting with no further improvement in learning. In this regime, continual learning methods do not substantially help.
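To make the phrase "functionally equivalent reparameterization within weight-space symmetries" concrete, the sketch below illustrates the standard positive-scaling symmetry of a two-layer ReLU network. It is not the paper's SIB procedure, which the abstract does not specify. The function names (`mlp`, `balance`), the bias-free architecture, and the norm-equalizing rule are all assumptions chosen for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mlp(x, w1, w2):
    """Hypothetical two-layer ReLU network without biases: w2 @ relu(w1 @ x)."""
    return w2 @ relu(w1 @ x)

def balance(w1, w2, eps=1e-12):
    """Rescale each hidden unit so its incoming and outgoing weight norms match.

    For hidden unit j, multiplying row j of w1 by s_j > 0 and dividing
    column j of w2 by s_j leaves the network function unchanged, because
    relu(s * z) = s * relu(z) for s > 0.
    """
    in_norm = np.linalg.norm(w1, axis=1) + eps    # incoming-weight norm per unit
    out_norm = np.linalg.norm(w2, axis=0) + eps   # outgoing-weight norm per unit
    s = np.sqrt(out_norm / in_norm)               # both norms become sqrt(in * out)
    return w1 * s[:, None], w2 / s[None, :]

rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 4)) * 10.0   # deliberately unbalanced layers
w2 = rng.normal(size=(3, 8)) * 0.1
x = rng.normal(size=(4,))

b1, b2 = balance(w1, w2)

# The rescaled weights compute the same function on this input ...
same_output = np.allclose(mlp(x, w1, w2), mlp(x, b1, b2))
# ... while per-unit incoming and outgoing norms are now equal.
norms_match = np.allclose(np.linalg.norm(b1, axis=1), np.linalg.norm(b2, axis=0))
```

The rescaling moves the weights along a symmetry orbit: outputs are identical, but the weights sit at a different point in parameter space, where the local loss geometry can differ. How SIB selects that point is described in the paper itself, not here.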
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 133