Continually learn to map visual concepts to language models in resource-constrained environments

Published: 01 Jan 2025, Last Modified: 28 Sept 2025. Neurocomputing 2025. License: CC BY-SA 4.0
Abstract: Highlights
• A small visual model can be trained with the knowledge space created by a frozen LM.
• CVM improves performance and mitigates forgetting on standard benchmarks.
• We study the generalization and transfer capabilities of our proposal.
• A CL method based on a large pre-trained model fails on fine-grained datasets.
• CVM achieves similar results with lower inference time on fine-grained datasets.