Revive and Recouple: Mitigating Plasticity Loss in Transformer Architectures

ICLR 2026 Conference Submission 21284 Authors

19 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: transformers, plasticity loss, continual learning, dormant neuron
Abstract: A key trait of general intelligence is the ability to continuously adapt and learn in non-stationary environments. Neural networks progressively lose their ability to learn in such settings, a phenomenon known as plasticity loss. Existing mitigation techniques for plasticity loss either require extensive additional memory or compute, or suffer from loss of crucial task information, causing a performance drop. Crucially, while this phenomenon is well studied in traditional MLPs, there is a significant lack of insight regarding loss of plasticity in transformer architectures. In this work, we find that plasticity loss also occurs in transformer architectures, both in dense layers and in layer-norm parameters. To address this, we present a novel two-step framework, Revive and Recouple (RnR), designed to mitigate plasticity loss while preserving crucial knowledge, thereby avoiding performance drops. Our experiments show that RnR significantly outperforms current approaches on transformer architectures in Continual Learning (CL) scenarios.
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 21284