Dirichlet Continual Learning: Tackling Catastrophic Forgetting in NLP

Published: 26 Apr 2024, Last Modified: 15 Jul 2024 · UAI 2024 poster · CC BY 4.0
Keywords: Continual Learning, Generative-based Rehearsal, Catastrophic Forgetting
Abstract: Catastrophic forgetting poses a significant challenge in continual learning (CL). In Natural Language Processing, generative-based rehearsal CL methods have made progress in avoiding expensive retraining. However, generating pseudo samples that accurately capture the task-specific distribution remains difficult. In this paper, we propose Dirichlet Continual Learning (DCL), a novel generative-based rehearsal strategy designed specifically for CL. Unlike the conventional use of a Gaussian latent variable in the Conditional Variational Autoencoder (CVAE), DCL exploits the flexibility of the Dirichlet distribution to model the latent variable. This allows DCL to effectively capture sentence-level features of previous tasks and guide the generation of pseudo samples. Additionally, we introduce Jensen-Shannon Knowledge Distillation, a robust logit-based knowledge distillation method that enhances knowledge transfer during pseudo-sample generation. Extensive experiments show that DCL outperforms state-of-the-art methods on two typical tasks of task-oriented dialogue systems, demonstrating its efficacy.
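The following is a minimal, hypothetical PyTorch sketch of the two ingredients the abstract names, not the authors' implementation: (1) a Dirichlet latent variable in place of the usual Gaussian in a CVAE encoder, and (2) a Jensen-Shannon, logit-based distillation loss. The helper names (`encode_dirichlet`, `js_distillation`), the temperature `tau`, and the 16-dimensional latent are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Dirichlet


def encode_dirichlet(hidden: torch.Tensor, proj: torch.nn.Linear) -> torch.Tensor:
    """Map an encoder hidden state to Dirichlet concentration parameters and
    draw a reparameterized latent sample (PyTorch's Dirichlet supports rsample)."""
    alpha = F.softplus(proj(hidden)) + 1e-3   # positive concentration parameters
    return Dirichlet(alpha).rsample()         # latent z lies on the probability simplex


def js_distillation(student_logits: torch.Tensor,
                    teacher_logits: torch.Tensor,
                    tau: float = 2.0) -> torch.Tensor:
    """Jensen-Shannon divergence between temperature-softened output distributions."""
    p = F.softmax(teacher_logits / tau, dim=-1)   # teacher distribution
    q = F.softmax(student_logits / tau, dim=-1)   # student distribution
    m = 0.5 * (p + q)                             # mixture distribution
    # JS(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m)
    return 0.5 * (F.kl_div(m.log(), p, reduction="batchmean")
                  + F.kl_div(m.log(), q, reduction="batchmean")) * tau ** 2


if __name__ == "__main__":
    proj = torch.nn.Linear(128, 16)               # hypothetical 16-dim Dirichlet latent
    z = encode_dirichlet(torch.randn(4, 128), proj)
    loss = js_distillation(torch.randn(4, 10, requires_grad=True),
                           torch.randn(4, 10))
    loss.backward()
    print(z.shape, loss.item())
```

In this sketch the JS loss is symmetric and bounded, which is one plausible reading of why a JS-based objective is described as more robust than plain KL-based distillation; the exact objective used in the paper may differ.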
List Of Authors: Min Zeng, Haiqin Yang, Wei Xue, Qifeng Liu, Yike Guo
Submission Number: 413