PRE: Vision-Language Prompt Learning with Reparameterization Encoder

ICLR 2024 Workshop DMLR Submission 102 Authors

Published: 04 Mar 2024, Last Modified: 02 May 2024 · DMLR @ ICLR 2024 · CC BY 4.0
Keywords: prompt learning; domain specific data; few-shot learning; vision-language foundation models; CLIP
Abstract: Large vision-language foundation models such as CLIP have demonstrated great potential in zero-shot transferability to downstream tasks. However, manual prompt engineering remains a major obstacle to deploying such models in practice, since it requires domain expertise and considerable time. To avoid non-trivial prompt engineering, the recent Context Optimization (CoOp) work introduced the concept of prompt learning to the vision domain using learnable textual tokens. While CoOp achieves substantial improvements over manual prompts, its learned context generalizes poorly to wider unseen classes within the same dataset. In this work, we present Prompt Learning with Reparameterization Encoder (PRE), a simple and efficient method that enhances the generalization of learnable prompts to unseen classes in practical domains. Instead of directly optimizing the prompts, PRE employs a prompt encoder to reparameterize the input prompt embeddings, enhancing the exploration of domain-specific knowledge from few-shot data. Experiments and extensive ablation studies on 8 benchmarks demonstrate that our approach is an efficient method for prompt learning in vision-language foundation models. Specifically, in the 16-shot setting, PRE achieves a notable improvement of 5.60% in average accuracy on New classes and 3% in harmonic mean compared to CoOp.
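To make the reparameterization idea concrete, below is a minimal PyTorch sketch of a prompt encoder in the spirit the abstract describes: the learnable context tokens are not optimized directly but are passed through a small network before being prepended to class-name embeddings for CLIP's frozen text encoder. The bidirectional-LSTM-plus-MLP architecture, module name `PromptEncoder`, and all hyperparameters here are illustrative assumptions (a common P-tuning-style choice), not the paper's exact design.

```python
import torch
import torch.nn as nn


class PromptEncoder(nn.Module):
    """Reparameterizes learnable context embeddings before they are
    concatenated with class-name tokens and fed to CLIP's text encoder.

    Sketch only: a BiLSTM followed by a two-layer MLP is assumed here;
    the paper's actual encoder architecture may differ.
    """

    def __init__(self, n_ctx: int = 16, ctx_dim: int = 512, hidden: int = 512):
        super().__init__()
        # Free parameters learned from few-shot data (as in CoOp).
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Reparameterization network: gradients flow through this
        # encoder rather than into the raw context tokens directly,
        # coupling the tokens and regularizing their optimization.
        self.lstm = nn.LSTM(ctx_dim, hidden // 2, num_layers=1,
                            bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, ctx_dim),
        )

    def forward(self) -> torch.Tensor:
        # (1, n_ctx, ctx_dim) -> encoded context of the same shape.
        out, _ = self.lstm(self.ctx.unsqueeze(0))
        return self.mlp(out).squeeze(0)  # (n_ctx, ctx_dim)


# Usage: prepend the reparameterized context to each class's token
# embeddings, then run CLIP's frozen text encoder as usual.
encoder = PromptEncoder()
ctx = encoder()  # (16, 512) reparameterized context embeddings
print(ctx.shape)
```

The intended effect, per the abstract, is that optimizing through the encoder rather than over raw embeddings improves generalization of the learned context to unseen classes.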
Primary Subject Area: Role of data in foundation models: pre-training, prompting, fine-tuning
Paper Type: Research paper: up to 8 pages
DMLR For Good Track: Participate in DMLR for Good Track
Participation Mode: In-person
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 102
