Rotation-Preserving Supervised Fine-Tuning

Published: 23 May 2026, Last Modified: 23 May 2026, Venue: CATS@ICML26 Poster, License: CC BY 4.0
Keywords: Supervised Fine-Tuning, Generalization, Catastrophic Forgetting, Representation Drift, Large Language Models
Abstract: Supervised fine-tuning (SFT) improves target-task performance but can rotate pretrained representations in ways that reduce out-of-domain generalization. We propose Rotation-Preserving Supervised Fine-Tuning (RPSFT), a simple regularizer that penalizes drift in the projected top-$k$ singular-vector block of pretrained weight matrices. Across Llama and Qwen models trained on OpenR1-Math, RPSFT improves the in-domain/OOD trade-off over SFT, importance-weighted SFT, and Dynamic Fine-Tuning, and gives strong initializations for downstream RL fine-tuning. The results suggest that controlling dominant-subspace rotation is a practical way to retain pretrained structure while still allowing task adaptation.
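The abstract does not give the exact form of the regularizer, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the "projected top-$k$ singular-vector block" of a weight matrix $W$ is $U_k^\top W V_k$, where $U_k$ and $V_k$ are the top-$k$ left and right singular vectors of the pretrained matrix $W_0$, and that drift is measured by a squared Frobenius norm. The function name `topk_block_drift` and the choice of norm are hypothetical.

```python
import numpy as np

def topk_block_drift(w0, w, k):
    """Hypothetical drift penalty (an assumed form, not taken from the paper).

    Projects the current weights `w` onto the top-k singular subspaces of the
    pretrained weights `w0` and measures how far the resulting k x k block
    has moved from its pretrained value diag(s_1, ..., s_k).
    """
    # Thin SVD of the pretrained matrix: w0 = u @ diag(s) @ vt.
    u, s, vt = np.linalg.svd(w0, full_matrices=False)
    u_k, vt_k = u[:, :k], vt[:k, :]

    # Projected k x k block of the fine-tuned weights.
    block = u_k.T @ w @ vt_k.T
    # The same projection applied to w0 gives the diagonal of top-k singular values.
    block0 = np.diag(s[:k])

    # Squared Frobenius norm of the change in the projected block.
    return float(np.sum((block - block0) ** 2))

# Toy usage with random stand-ins for pretrained and fine-tuned weights.
rng = np.random.default_rng(0)
w0 = rng.standard_normal((8, 6))
w = w0 + 0.1 * rng.standard_normal((8, 6))
print(topk_block_drift(w0, w, k=3))
```

Under this assumed form, an update that lies entirely outside the top-$k$ subspaces of $W_0$ incurs no penalty, while an update inside them is penalized. That matches the abstract's stated goal of retaining dominant pretrained structure while still allowing task adaptation. In training, such a term would presumably be summed over the regularized matrices and added to the SFT loss with a weighting coefficient, which is also an assumption here.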
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 43