CLIP model is an Efficient Online Continual Learner

26 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: online continual learning, vision-language models, CLIP, task-agnostic continual learning
TL;DR: Enforcing image-text symmetry effectively mitigates catastrophic forgetting in online continual learning with CLIP.
Abstract: Online continual learning addresses the challenge of learning from continuous, non-stationary data streams. Existing online continual learning frameworks are classification-based and assume a pre-defined number of classes. In this study, we propose that vision-language models (VLMs) are better suited to online continual learning. Compared to traditional classification-based frameworks, a VLM such as CLIP is not limited by a maximum number of classes or constrained by a rigid model architecture, enabling it to generalize across both known and emerging classes. However, we find that naively tuning CLIP for online continual learning results in asymmetric image-text matching. This asymmetric matching consistently imposes negative suppression on previously learned classes, leading to catastrophic forgetting. To address this issue, we propose a simple yet effective method, the symmetric image-text (SIT) tuning strategy, which mitigates the adverse impact of negative samples by excluding asymmetric text during online learning. Additionally, we introduce a more challenging online continual learning setting with blurred task boundaries, namely MiD-Blurry, which mixes multiple data distributions to simulate real-world scenarios. We conduct extensive experiments on several continual learning benchmarks as well as the MiD-Blurry setting, evaluating both inference-at-any-time performance and generalization to future data. Our results demonstrate that the SIT strategy effectively preserves memory stability while maintaining learning plasticity.
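Below is a minimal sketch, not the authors' released code, of how the SIT idea described in the abstract could look in practice: the contrastive loss is computed only over text prompts for classes present in the current batch, so no text from a previously learned but absent class is pushed away as a negative. It assumes the open-source OpenAI CLIP interface (`clip.tokenize`, `model.encode_image`, `model.encode_text`); `class_names` and the prompt template are illustrative placeholders.

```python
# Hedged sketch of a symmetric image-text (SIT) style loss, assuming the
# OpenAI CLIP API. Not the authors' implementation.
import torch
import torch.nn.functional as F
import clip

def sit_loss(model, images, labels, class_names, temperature=0.07):
    """Symmetric contrastive loss restricted to classes in the batch.

    Prompts for classes absent from the batch (the "asymmetric" texts)
    are excluded, so previously learned classes are never suppressed
    as negatives during the online update.
    """
    device = images.device
    # Unique class ids in this batch; `targets` maps each image to the
    # index of its class within that unique set.
    classes, targets = torch.unique(labels, return_inverse=True)
    prompts = clip.tokenize(
        [f"a photo of a {class_names[c]}" for c in classes.tolist()]
    ).to(device)

    img = F.normalize(model.encode_image(images), dim=-1)   # (B, d)
    txt = F.normalize(model.encode_text(prompts), dim=-1)   # (C, d)
    logits = img @ txt.t() / temperature                    # (B, C)

    # Image -> text direction: each image's positive is its class prompt.
    loss_i2t = F.cross_entropy(logits, targets)

    # Text -> image direction: a prompt may match several batch images,
    # so use a soft target spread uniformly over that class's images.
    mask = (targets.unsqueeze(0)
            == torch.arange(len(classes), device=device).unsqueeze(1)).float()
    soft = mask / mask.sum(dim=1, keepdim=True)             # (C, B)
    loss_t2i = -(soft * F.log_softmax(logits.t(), dim=1)).sum(dim=1).mean()

    return 0.5 * (loss_i2t + loss_t2i)
```

Contrast this with naive CLIP tuning, which scores each image against prompts for every class seen so far: there, absent classes appear only as negatives, producing the asymmetric matching and negative suppression the abstract identifies.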
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6298