Introducing Common Null Space of Gradients for Gradient Projection Methods in Continual Learning

Published: 20 Jul 2024, Last Modified: 06 Aug 2024, MM2024 Poster, CC BY 4.0
Abstract: Continual learning aims to learn new knowledge from a sequence of tasks without forgetting. Recent studies have found that projecting gradients onto the direction orthogonal to task-specific features is effective. However, these methods mainly focus on mitigating catastrophic forgetting by using old features to construct projection spaces, neglecting both the potential to enhance plasticity and the valuable information contained in previous gradients. To enhance plasticity and make effective use of the gradients from old tasks, we propose Gradient Projection in Common Null Space (GPCNS), which projects current gradients into the common null space of the final gradients of all preceding tasks. Moreover, to integrate both feature and gradient information, we propose a collaborative framework that allows GPCNS to be used in conjunction with existing gradient projection methods as a plug-and-play extension that provides gradient information and better plasticity. Experiments on three benchmarks demonstrate that GPCNS exhibits superior plasticity compared to conventional gradient projection methods. More importantly, when applied as a plugin, GPCNS effectively improves the backward transfer and average accuracy of existing gradient projection methods, outperforming all the gradient projection methods without adding learnable parameters or customized objective functions. The code is available at https://github.com/Hifipsysta/GPCNS.
Primary Subject Area: [Content] Vision and Language
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: Continual learning aims to learn new knowledge from a sequence of tasks without forgetting. It has been proposed for and applied to computer vision, natural language processing and multi-modal scenarios, including object detection, semantic segmentation, relation extraction, neural machine translation, cross-modal retrieval and visual question answering. We propose GPCNS to address two limitations of gradient projection methods in continual learning: their lack of plasticity and their neglect of the gradients from old tasks. It thereby contributes to the fields of computer vision (CV), natural language processing (NLP) and multimodality.
Supplementary Material: zip
Submission Number: 1214
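
The core operation named in the abstract, projecting a current gradient into the common null space of earlier gradients, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes each stored gradient is flattened to a vector, and the function name `project_to_common_null_space` and the tolerance `tol` are hypothetical choices made for illustration only.

```python
import numpy as np


def project_to_common_null_space(grad, old_grads, tol=1e-10):
    """Project `grad` onto the set {x : G x = 0}, where the rows of G are
    the stored gradients of earlier tasks (illustrative assumption: each
    gradient is a flat vector of the same length)."""
    G = np.atleast_2d(np.asarray(old_grads, dtype=float))
    g = np.asarray(grad, dtype=float)

    # The right singular vectors with non-negligible singular values form
    # an orthonormal basis of the row space of G, i.e. the span of the
    # old gradients.
    _, s, vt = np.linalg.svd(G, full_matrices=False)
    basis = vt[s > tol * s.max()]

    # Removing the row-space component leaves the part of g that is
    # orthogonal to every old gradient at once.
    return g - basis.T @ (basis @ g)


# Usage: two old gradients span the first two coordinate axes, so only
# the third coordinate of the new gradient survives the projection.
old = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
new = [1.0, 2.0, 3.0]
projected = project_to_common_null_space(new, old)
print(np.allclose(np.asarray(old) @ projected, 0.0))
```

In this sketch an update along `projected` has zero inner product with each stored gradient. How GPCNS selects, stores and combines gradients per layer is not specified in the abstract and is left out here.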