MKT: A Multi-Stage Knowledge Transfer Framework to Mitigate Catastrophic Forgetting in Multi-Domain Chinese Spelling Correction
Abstract: Chinese Spelling Correction (CSC) aims to detect and correct spelling errors in given sentences. Recently, multi-domain CSC has gradually attracted the attention of researchers because it is more practical. In this paper, we focus on a key flaw of CSC models when adapting to multi-domain scenarios: the tendency to forget previously acquired knowledge upon learning new domain-specific knowledge (i.e., $\textbf{catastrophic forgetting}$). To address this, we propose a novel model-agnostic $\textbf{M}$ulti-stage $\textbf{K}$nowledge $\textbf{T}$ransfer ($\textbf{MKT}$) framework that leverages an evolving teacher model and dynamic distillation weights to transfer knowledge in each domain, rather than focusing solely on new domain knowledge. To the best of our knowledge, we are the first to apply continual learning methods to the multi-domain CSC task. Experiments demonstrate the effectiveness of our method over traditional approaches, highlighting the importance of overcoming catastrophic forgetting to enhance model performance.
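The abstract does not spell out the exact training objective, but the described mechanism (an evolving teacher plus dynamically weighted distillation across domain stages) admits a compact sketch. Below is a minimal, hypothetical PyTorch illustration: `kd_loss` blends the task cross-entropy with a temperature-softened KL term, `alpha_schedule` stands in for the dynamic distillation weight, and the teacher is refreshed from a frozen snapshot of the student after each domain stage. All names and the specific blending scheme are assumptions for illustration, not the paper's actual implementation.

```python
import copy
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha, T=2.0):
    """Task loss blended with soft-label distillation.

    `alpha` plays the role of the dynamic distillation weight; this
    linear blending scheme is an assumption, not the paper's exact form.
    """
    # Standard cross-entropy against gold labels (new-domain supervision).
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    # KL divergence between temperature-softened student and teacher
    # distributions; the teacher preserves earlier-domain knowledge.
    kld = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return (1.0 - alpha) * ce + alpha * kld

def train_stages(student, domain_loaders, optimizer, alpha_schedule):
    """One training stage per domain; the teacher evolves between stages."""
    teacher = None
    for stage, loader in enumerate(domain_loaders):
        alpha = alpha_schedule(stage)  # dynamic weight per stage (hypothetical)
        for input_ids, labels in loader:
            logits = student(input_ids)
            if teacher is None:
                # First domain: no prior knowledge to distill, plain task loss.
                loss = F.cross_entropy(
                    logits.view(-1, logits.size(-1)), labels.view(-1)
                )
            else:
                with torch.no_grad():
                    t_logits = teacher(input_ids)
                loss = kd_loss(logits, t_logits, labels, alpha)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Freeze a snapshot of the student as the next stage's teacher.
        teacher = copy.deepcopy(student).eval()
        for p in teacher.parameters():
            p.requires_grad_(False)
    return student
```

The key design point this sketch captures is that distillation targets come from the most recent stage's model rather than a fixed initial teacher, so knowledge accumulates across domains instead of being overwritten.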
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Educational Applications, Chinese Spelling Correction
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: Chinese
Submission Number: 2904