Fine-tuned Language Models can be Continual Learners

Tuhin Chakrabarty; Thomas Scialom; Smaranda Muresan

Published: 09 Apr 2022, Last Modified: 05 May 2023, BigScience#5, nonarchival, Readers: Everyone

Keywords: T0, Continual Learning, Instruction Tuning, Multitask Prompt Training

TL;DR: Continual Learning of Instruction Tuned models

Abstract: Recent work on large language models relies on the intuition that most natural language processing tasks can be described via natural language instructions. Language models trained on these instructions show strong zero-shot performance on several standard datasets. However, these models, though impressive, can still perform poorly on a wide range of tasks outside their respective training and evaluation sets, and/or can be prohibitively large. A natural solution to this limitation is Continual Learning: a model that keeps extending its knowledge and abilities without forgetting previous skills. Despite the limited success of Continual Learning, we show that fine-tuned language models can be continual learners. Our resulting model, Continual-T0 (CT0), is able to learn 8 diverse tasks while still achieving similar zero-shot performance on T0 evaluation tasks. As an additional finding, we notice that CT0 can generalize to instruction composition, combining instructions in ways it was never trained for.
