Abstract: Continual learning (CL) learns a sequence of tasks incrementally with two main
objectives: overcoming catastrophic forgetting (CF) and
encouraging knowledge transfer (KT) across tasks. However, most existing techniques focus only on overcoming CF and have no mechanism to encourage KT,
and thus perform poorly on KT. Although several papers have tried to deal with
both CF and KT, our experiments show that they suffer from serious CF when
the tasks have little shared knowledge. Another observation is that most
current CL methods do not use pre-trained models, although such models have been
shown to significantly improve end-task performance. For example, in natural
language processing, fine-tuning a BERT-like pre-trained language model is one of
the most effective approaches. However, for CL, this approach suffers from serious
CF. An interesting question is how to make the best use of pre-trained models for
CL. This paper proposes a novel model called CTR to solve these problems. Our
experimental results demonstrate the effectiveness of CTR.
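To make the fine-tuning approach mentioned above concrete, the sketch below fine-tunes a BERT-like pre-trained model on a single classification task with Hugging Face Transformers. The model name, toy data, and hyperparameters are illustrative assumptions only; this is not the paper's CTR method, just the baseline setting in which sequential fine-tuning suffers from CF.

```python
# Minimal sketch: fine-tuning a BERT-like pre-trained model on one task.
# Assumptions: Hugging Face `transformers` + PyTorch are installed; the toy
# data and hyperparameters below are placeholders, not the paper's setup.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Toy task data (placeholder for a real task's training set).
texts = ["great movie", "terrible plot"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps on this task
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# In a CL setting, repeating this loop for task 2, task 3, ... with the same
# model overwrites parameters learned for earlier tasks -- the catastrophic
# forgetting that the abstract refers to.
```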