Capacity-Gated Forgetting in LoRA Fine-Tuning: Rank, Proximity, and Endogenous Replay in Medical LLMs
Keywords: catastrophic forgetting, LoRA, QLoRA, continual fine-tuning, medical language models, MMLU, replay, capacity-gated forgetting, endogenous replay
TL;DR: LoRA forgetting transitions from uniform to proximity-structured at a capacity threshold; endogenous replay (sampling from the base model on anchor prompts) cuts MMLU forgetting by 80% relative to no replay, without sacrificing MedQA accuracy.
Abstract: LoRA fine-tuning on narrow domains improves target behaviour but can erase broad pretrained competence. Existing accounts disagree over the roles of scale, semantic proximity, and adapter rank, and often use incompatible models, tasks, and evaluation protocols. We run a controlled 11-experiment battery on Qwen3.5-9B-Base fine-tuned on MedQA-USMLE, with per-subject MMLU evaluation across 57 subjects. We propose Capacity-Gated Forgetting (CGF): the capacity ratio $\rho = r/d$ induces two categorical regimes, uniform forgetting below a critical threshold $\rho \approx 1$ and proximity-structured forgetting above it. Rank dominates forgetting ($\chi^2 = 166.5$, $p < 10^{-35}$). The no-replay baseline (E02) forgets 0.352, while replay reduces forgetting to 0.180 with 50 real examples, 0.142 with 100 real examples, and 0.070 with 100 endogenous examples. Endogenous Replay thus yields an 80% reduction over no replay and approximately 50% over real replay with the same number of examples, with a KL anchoring argument explaining its sample efficiency. A matched-configuration MedQA target-accuracy check (K0) confirms that Endogenous Replay does not sacrifice fine-tuning performance: at $r = 16$ and 500 steps, E11 reaches 78.5% MedQA accuracy versus 78.0% (E02, no replay) and 77.5% (E10, real replay), a Pareto improvement on both axes. K1-K4 remain future validation checks.
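The relative reductions quoted in the abstract can be rechecked from the reported forgetting scores alone. The sketch below is illustrative only: the helper name `relative_reduction` and the dictionary keys are assumptions introduced here, not identifiers from the paper, and the only data used are the four forgetting values stated above.

```python
def relative_reduction(baseline: float, treated: float) -> float:
    """Fraction of the baseline forgetting removed by the treatment."""
    return (baseline - treated) / baseline


# Forgetting scores as reported in the abstract (key names are hypothetical labels).
forgetting = {
    "no_replay": 0.352,       # E02
    "real_50": 0.180,         # replay with 50 real examples
    "real_100": 0.142,        # replay with 100 real examples
    "endogenous_100": 0.070,  # replay with 100 endogenous examples
}

# Endogenous replay versus no replay, and versus real replay at the same budget.
vs_no_replay = relative_reduction(forgetting["no_replay"], forgetting["endogenous_100"])
vs_real_replay = relative_reduction(forgetting["real_100"], forgetting["endogenous_100"])

print(f"endogenous vs no replay:   {vs_no_replay:.1%}")
print(f"endogenous vs real replay: {vs_real_replay:.1%}")
```

Under these numbers the two ratios come out close to the 80% and roughly 50% figures stated in the abstract, where the second comparison is taken against real replay with the same 100-example budget.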
Paper Type: Long (8 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 147