Minifinetuning: Low-Data Generation Domain Adaptation through Corrective Self-Distillation

Peter Belcak; Greg Heinrich; Pavlo Molchanov; Jan Kautz

Submitted to ICLR 2025, 26 Sept 2024 (modified: 05 Feb 2025). License: CC BY 4.0

Keywords: minifinetuning, mft, finetuning, low-resource finetuning, self-distillation, corrective distillation

TL;DR: Corrective self-distillation outperforms PEFT methods and can be used in the absence of replay data to mitigate degeneralization due to finetuning.

Abstract: Finetuning language models for a new domain inevitably degrades their general performance, and the degradation becomes more pronounced the more limited the finetuning data. We introduce minifinetuning (MFT), a method for language model domain adaptation that considerably reduces overfitting-induced degeneralization in low-data settings and does so without any pre-training data for replay. Across a wide range of models and domains, MFT demonstrates 2-10x more favourable specialization-to-degeneralization ratios than standard finetuning and exhibits an intrinsic robustness to overfitting when data in the new domain is scarce, down to as few as 500 samples. Employing corrective self-distillation that is individualized at the sample level, MFT outperforms parameter-efficient finetuning methods, demonstrates replay-like forgetting mitigation, and can be composed with either parameter-efficient finetuning or replay for a combined effect.
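
The abstract does not spell out how the corrective self-distillation target is built, so the following is only an illustrative sketch of one possible reading, not the authors' formulation. It assumes that a frozen copy of the original model acts as teacher, that its per-token distribution is "corrected" by blending it with the one-hot ground-truth label, and that the student is trained with a soft cross-entropy against that target. The blending rule, the `alpha` parameter, and all function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def corrected_target(teacher_probs, label, alpha=0.5):
    """Blend the teacher distribution with the one-hot label.

    ASSUMPTION: linear blending with weight `alpha` is a stand-in for
    whatever correction rule the paper actually uses.
    """
    return [
        (1.0 - alpha) * p + (alpha if i == label else 0.0)
        for i, p in enumerate(teacher_probs)
    ]

def soft_cross_entropy(target, student_logits):
    """Cross-entropy between a soft target and the student's prediction."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(q) for t, q in zip(target, student_probs) if t > 0)

# Toy example over a 3-token vocabulary: the teacher prefers token 0,
# but the ground-truth next token in the new domain is token 1.
teacher_logits = [2.0, 1.0, 0.1]
label = 1
teacher_probs = softmax(teacher_logits)
target = corrected_target(teacher_probs, label, alpha=0.5)

# At the start of finetuning the student equals the teacher.
loss = soft_cross_entropy(target, teacher_logits)
```

Under this assumed rule the target keeps part of the teacher's probability mass on every token, which is one way a per-sample target could pull the model toward the new domain without discarding its original predictions.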

Primary Area: foundation or frontier models, including LLMs

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 6520
