Keywords: Continual Learning, Lifelong Language Learning, Adapter Transformer, Natural Language Processing
TL;DR: Transformer with Adapter Modules that sequentially learns new NLP tasks in various domains and prevents catastrophic forgetting without retraining the model from scratch
Abstract: Continual Learning is important for real-world natural language processing applications,
where computational systems are required to interact with continuous streams of tasks
and language over time. When forced to adapt to new tasks and inputs, language models
experience catastrophic forgetting. Current generative replay-based algorithms do not
scale to many tasks, and their performance can degrade when the task order changes.
In this paper, we propose a model based on network growth - a pre-trained Transformer
with Adapter modules for each task - that sequentially learns new NLP tasks in various
domains and prevents catastrophic forgetting without retraining the model from scratch.
We train and maintain a lightweight adapter module for each task in sequence. While
growing the network by less than 15% and avoiding both replay and task-order bias,
the current design increases average task accuracy by 1.3% over the baseline
models.
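The core mechanism described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bottleneck width, the ReLU nonlinearity, the residual connection, and the task names are assumptions based on standard adapter designs, since the abstract does not specify them.

```python
import numpy as np

class Adapter:
    """Hypothetical bottleneck adapter: down-project, nonlinearity, up-project,
    plus a residual connection. One such module is trained per task while the
    pre-trained Transformer weights stay frozen."""

    def __init__(self, hidden_dim, bottleneck_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Only these small matrices are trainable; the backbone is frozen.
        self.w_down = rng.normal(0.0, 0.02, (hidden_dim, bottleneck_dim))
        self.w_up = rng.normal(0.0, 0.02, (bottleneck_dim, hidden_dim))

    def __call__(self, h):
        z = np.maximum(h @ self.w_down, 0.0)  # ReLU in the bottleneck
        return h + z @ self.w_up              # residual preserves frozen features

# One adapter per task; at inference the adapter matching the task is swapped in.
adapters = {task: Adapter(768, 64) for task in ["task_a", "task_b"]}
h = np.zeros((2, 768))            # stand-in for a Transformer hidden state
out = adapters["task_a"](h)
print(out.shape)                  # (2, 768)
```

Because earlier tasks' adapters are never overwritten, previously learned behavior is preserved by construction, at the cost of a small per-task parameter overhead (here 2 × 768 × 64 weights per adapter).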