Power Norm Based Lifelong Learning for Paraphrase Generation

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission · Readers: Everyone
Abstract: Seq2seq language generation models are trained on multiple domains in a continual learning manner, with the data from each domain observed in an online fashion. However, continual learning typically suffers from catastrophic forgetting, a persistent challenge for lifelong learning. To handle this problem, existing work has leveraged experience replay or dynamic architectures to consolidate past knowledge, which, however, results in growing memory requirements or high computational cost. In this work, we propose an innovative framework, PNLLL, that remedies catastrophic forgetting by applying power normalization to NLP transformer models. Specifically, PNLLL leverages the power norm to achieve a better balance between past experience rehearsal and new knowledge acquisition. Our experiments on paraphrase generation show that PNLLL outperforms SOTA models by a considerable margin and greatly mitigates forgetting.
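The abstract names power normalization as the core component applied to the transformer. As a rough illustration only, a minimal sketch of a power-norm layer is given below: it rescales activations by a running estimate of their quadratic mean rather than per-batch mean/variance statistics. The class name, momentum `alpha`, and other details are illustrative assumptions, not the authors' PNLLL implementation.

```python
import torch
import torch.nn as nn

class PowerNorm(nn.Module):
    """Illustrative sketch of a power-normalization layer (hypothetical
    names/hyperparameters): activations are divided by the square root of a
    running quadratic mean, followed by a learnable affine transform."""

    def __init__(self, dim, alpha=0.9, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))   # learnable scale
        self.beta = nn.Parameter(torch.zeros(dim))   # learnable shift
        self.alpha = alpha                           # running-average momentum (assumed value)
        self.eps = eps
        self.register_buffer("running_psi", torch.ones(dim))  # running quadratic mean

    def forward(self, x):
        # x: (..., dim)
        if self.training:
            # quadratic mean over all non-feature dimensions
            psi = x.pow(2).mean(dim=tuple(range(x.dim() - 1)))
            self.running_psi = (self.alpha * self.running_psi
                                + (1 - self.alpha) * psi.detach())
        else:
            psi = self.running_psi
        x_hat = x / torch.sqrt(psi + self.eps)
        return self.gamma * x_hat + self.beta
```

Such a layer would typically be used in place of layer normalization inside transformer blocks; how PNLLL combines it with rehearsal of past-domain data is specific to the paper and not shown here.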