Teacher-Forced Selective Self-Distillation for Uncurated Replay Data

ACL ARR 2025 July Submission 252 Authors

26 Jul 2025 (modified: 05 Sept 2025) · ACL ARR 2025 July Submission · CC BY 4.0
Abstract: Continual fine-tuning involves incrementally training a language model to acquire knowledge of new tasks. This learning paradigm introduces the challenge of catastrophic forgetting, where models tend to forget previously learned tasks as they adapt to new ones. Several techniques have been proposed to address this issue, including regularization, parameter isolation, and replay-based approaches. Among these, replay-based methods have gained wider adoption due to their less invasive nature and ease of integration into existing continual learning pipelines. However, in real-world settings, replay-based methods face the practical challenge of curating ideal replay samples. This leads to the use of noisy replay data from the task owner, which is often suboptimal for improving task performance. To address this crucial real-world challenge, we introduce Teacher-Forced Selective Self-Distillation (TF-SSD), a novel method that self-distills labels from the task-stage model and refines the less effective samples using a mixture-of-teachers framework. Our experiments on a challenging 16-task continual learning setting demonstrate that TF-SSD outperforms the best-performing baseline by $\sim$2.7 points in task performance and $\sim$2.8 points in mitigating catastrophic forgetting across two model families: Llama2 7B and Granite3.3 2B. We plan to open-source the code for TF-SSD.
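The abstract suggests a pipeline in which noisy replay samples are first scored for quality and the weak ones are then relabeled by a pool of teachers. The Python sketch below is one plausible reading of that pipeline under stated assumptions: the scoring rule, the threshold, and every function name (`select_and_relabel`, `score_fn`, `teacher_fns`) are hypothetical, since the abstract does not specify the actual TF-SSD procedure.

```python
"""Minimal sketch of selective replay relabeling with a mixture of
teachers, as loosely described in the abstract. All names, the scoring
rule, and the threshold are assumptions, not the paper's method."""

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReplaySample:
    prompt: str
    label: str           # possibly noisy label supplied by the task owner
    quality: float = 0.0  # filled in by the scoring step


def select_and_relabel(
    samples: List[ReplaySample],
    score_fn: Callable[[ReplaySample], float],
    teacher_fns: List[Callable[[str], str]],
    threshold: float = 0.5,
) -> List[ReplaySample]:
    """Keep high-quality replay samples as-is; relabel the rest.

    score_fn    -- e.g. the task-stage model's confidence in the stored
                   label (an assumption; the selection criterion is not
                   given in the abstract)
    teacher_fns -- a pool of teacher models whose outputs replace the
                   labels of low-quality samples ("mixture of teachers")
    """
    refined = []
    for s in samples:
        s.quality = score_fn(s)
        if s.quality < threshold:
            # Low-quality sample: regenerate the label with each teacher
            # and keep the candidate the scorer ranks highest (one
            # plausible way to combine a mixture of teachers).
            candidates = [
                ReplaySample(s.prompt, t(s.prompt)) for t in teacher_fns
            ]
            s = max(candidates, key=score_fn)
            s.quality = score_fn(s)
        refined.append(s)
    return refined


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    data = [ReplaySample("2+2=", "5"), ReplaySample("3+3=", "6")]
    score = lambda s: 1.0 if s.label == str(eval(s.prompt.rstrip("="))) else 0.0
    teachers = [lambda p: str(eval(p.rstrip("="))), lambda p: "unsure"]
    for s in select_and_relabel(data, score, teachers):
        print(s.prompt, s.label, s.quality)
```

In this toy run, the mislabeled sample ("2+2=" with label "5") falls below the threshold and is relabeled by the teacher pool, while the correct sample is kept untouched; how TF-SSD actually scores, selects, and combines teachers is only described at a high level in the abstract.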
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Continual learning, fine-tuning
Contribution Types: NLP engineering experiment
Languages Studied: English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Ethics Statement
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: 2, 4
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: 4
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: 4
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Ethics Statement
B5 Documentation Of Artifacts: Yes
B5 Elaboration: 4
B6 Statistics For Data: Yes
B6 Elaboration: 4, Appendix
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: 4, Appendix
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: 4, Appendix
C3 Descriptive Statistics: Yes
C3 Elaboration: 5
C4 Parameters For Packages: Yes
C4 Elaboration: Appendix
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 252