Teacher-Forced Selective Self-Distillation for Uncurated Replay Data

ACL ARR 2025 July Submission 252 Authors

26 Jul 2025 (modified: 05 Sept 2025) · ACL ARR 2025 July Submission · CC BY 4.0
Abstract: Continual fine-tuning involves incrementally training a language model to acquire knowledge of new tasks. This learning paradigm introduces the challenge of catastrophic forgetting, where models tend to forget previously learned tasks as they adapt to new ones. Several techniques have been proposed to address this issue, including regularization, parameter isolation, and replay-based approaches. Among these, replay-based methods have gained wider adoption due to their less invasive nature and ease of integration into existing continual learning pipelines. However, in real-world settings, replay-based methods face the practical challenge of curating ideal replay samples. This leads to the use of noisy replay data from the task owner, which is often suboptimal for improving task performance. To address this crucial real-world challenge, we introduce Teacher-Forced Selective Self-Distillation (TF-SSD), a novel method that self-distills labels from the task-stage model and refines the less effective samples using a mixture-of-teachers framework. Our experiments on a challenging 16-task continual learning setting demonstrate that TF-SSD outperforms the best-performing baseline by $\sim$2.7 points in task performance and $\sim$2.8 points in mitigating catastrophic forgetting across two model families: Llama2 7B and Granite3.3 2B. We plan to open-source the code for TF-SSD.
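The abstract suggests a pipeline in which noisy replay samples are first scored for quality and the weak ones are then relabeled by a pool of teachers. The Python sketch below is one plausible reading of that pipeline under stated assumptions: the scoring rule, the threshold, and every function name (`select_and_relabel`, `score_fn`, `teacher_fns`) are hypothetical, since the abstract does not specify the actual TF-SSD procedure.

```python
"""Minimal sketch of selective replay relabeling with a mixture of
teachers, as loosely described in the abstract. All names, the scoring
rule, and the threshold are assumptions, not the paper's method."""

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ReplaySample:
    prompt: str
    label: str           # possibly noisy label supplied by the task owner
    quality: float = 0.0  # filled in by the scoring step


def select_and_relabel(
    samples: List[ReplaySample],
    score_fn: Callable[[ReplaySample], float],
    teacher_fns: List[Callable[[str], str]],
    threshold: float = 0.5,
) -> List[ReplaySample]:
    """Keep high-quality replay samples as-is; relabel the rest.

    score_fn    -- e.g. the task-stage model's confidence in the stored
                   label (an assumption; the selection criterion is not
                   given in the abstract)
    teacher_fns -- a pool of teacher models whose outputs replace the
                   labels of low-quality samples ("mixture of teachers")
    """
    refined = []
    for s in samples:
        s.quality = score_fn(s)
        if s.quality < threshold:
            # Low-quality sample: regenerate the label with each teacher
            # and keep the candidate the scorer ranks highest (one
            # plausible way to combine a mixture of teachers).
            candidates = [
                ReplaySample(s.prompt, t(s.prompt)) for t in teacher_fns
            ]
            s = max(candidates, key=score_fn)
            s.quality = score_fn(s)
        refined.append(s)
    return refined


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    data = [ReplaySample("2+2=", "5"), ReplaySample("3+3=", "6")]
    score = lambda s: 1.0 if s.label == str(eval(s.prompt.rstrip("="))) else 0.0
    teachers = [lambda p: str(eval(p.rstrip("="))), lambda p: "unsure"]
    for s in select_and_relabel(data, score, teachers):
        print(s.prompt, s.label, s.quality)
```

In this toy run, the mislabeled sample ("2+2=" with label "5") falls below the threshold and is relabeled by the teacher pool, while the correct sample is kept untouched; how TF-SSD actually scores, selects, and combines teachers is only described at a high level in the abstract.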
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Continual learning, fine-tuning
Contribution Types: NLP engineering experiment
Languages Studied: English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Ethics Statement
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: 2, 4
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: 4
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: 4
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Ethics Statement
B5 Documentation Of Artifacts: Yes
B5 Elaboration: 4
B6 Statistics For Data: Yes
B6 Elaboration: 4, Appendix
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: 4, Appendix
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: 4, Appendix
C3 Descriptive Statistics: Yes
C3 Elaboration: 5
C4 Parameters For Packages: Yes
C4 Elaboration: Appendix
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 252