Fine-tuning can cripple your foundation model; preserving features may be the solution

Jishnu Mukhoti; Yarin Gal; Philip Torr; Puneet K. Dokania

Fine-tuning can cripple your foundation model; preserving features may be the solution

Jishnu Mukhoti, Yarin Gal, Philip Torr, Puneet K. Dokania

Published: 28 Jun 2024, Last Modified: 09 May 2025Accepted by TMLREveryoneRevisionsBibTeXCC BY 4.0

Event Certifications: iclr.cc/ICLR/2025/Journal_Track

Abstract: Pre-trained foundation models, due to their enormous capacity and exposure to vast amounts of data during pre-training, are known to have learned plenty of real-world concepts. An important step in making these pre-trained models effective on downstream tasks is to fine-tune them on related datasets. While various fine-tuning methods have been devised and have been shown to be highly effective, we observe that a fine-tuned model's ability to recognize concepts on tasks different from the downstream one is reduced significantly compared to its pre-trained counterpart. This is an undesirable effect of fine-tuning as a substantial amount of resources was used to learn these pre-trained concepts in the first place. We call this phenomenon "concept forgetting'' and via experiments show that most end-to-end fine-tuning approaches suffer heavily from this side effect. To this end, we propose a simple fix to this problem by designing a new fine-tuning method called LDIFS (short for $\ell_2$ distance in feature space) that, while learning new concepts related to the downstream task, allows a model to preserve its pre-trained knowledge as well. Through extensive experiments on 10 fine-tuning tasks we show that LDIFS significantly reduces concept forgetting. Additionally, we show that LDIFS is highly effective in performing continual fine-tuning on a sequence of tasks as well, in comparison with both fine-tuning as well as continual learning baselines.

Certifications: Featured Certification

Submission Length: Regular submission (no more than 12 pages of main content)

Video: https://youtu.be/VVG_KWi-BSM

Code: https://github.com/omegafragger/ldifs_code

Assigned Action Editor: ~Marcus_Rohrbach1

Submission Number: 2137

Loading