Enhancing Target-unspecific Tasks through a Features Matrix

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
Abstract: Recent developments in prompt learning for large Vision-Language Models (VLMs) have significantly improved performance on target-specific tasks. However, these prompting methods often struggle to handle target-unspecific, or generalizable, tasks effectively. This may be because overfitting during training causes the model to forget its general knowledge, which strongly benefits target-unspecific tasks. To alleviate this issue, we propose a novel Features Matrix (FM) approach designed to enhance these models on target-unspecific tasks. Our method extracts general knowledge and uses it to build the FM. Specifically, the FM captures the semantics of diverse inputs at a deep, fine-grained level, preserving essential general knowledge and thereby mitigating the risk of overfitting. Representative evaluations demonstrate that 1) the FM is compatible with existing frameworks as a generic and flexible module, and 2) the FM significantly improves performance on target-unspecific tasks (base-to-novel generalization, domain generalization, and cross-dataset generalization), achieving state-of-the-art performance.
Lay Summary: Recent advancements in training large Vision-Language Models (VLMs) using prompts have greatly boosted their performance in specific tasks. However, these methods often struggle with more general tasks that require broader understanding. One reason behind this challenge is that during training, the model can become too focused on specific details, forgetting its broader knowledge. This general knowledge is crucial for handling tasks that are not narrowly defined. To address this issue, we introduce a new approach called the Features Matrix (FM) method, aimed at improving VLMs on general tasks. Our method works by extracting and utilizing this general knowledge to create a Features Matrix (FM). The FM captures the deep semantics of various inputs, preserving essential knowledge and reducing the risk of becoming too specialized. Through thorough evaluations, we have found that the FM can seamlessly integrate with existing frameworks as a versatile module. Moreover, it significantly enhances performance on general tasks such as base-to-novel generalization, domain generalization, and cross-dataset generalization, achieving top-tier results.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: General Machine Learning->Supervised Learning
Keywords: Prompt learning
Submission Number: 123