OpenPL: Realistic Evaluation of Prompt Learning for VLM in Open Environments

Zi-Kang Wang; Song-Lin Lv; Hao-Zhe Tan; Zhi Zhou; Yu-Feng Li; Lan-Zhe Guo

28 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: VLM; Prompt Learning; Open environments

Abstract: Vision-language models (VLMs) have demonstrated impressive zero-shot capabilities across various image classification tasks, and their performance can be further enhanced through prompt learning methods. To evaluate the effectiveness of prompt learning, it is important to assess its robustness to new classes and distributional shifts. However, current studies typically assume a single data distribution shift and a new class space that is known in advance. These assumptions leave a gap with real-world open environments, where data distributions and classes are often uncertain and subject to continuous change. To better analyze the robustness of prompt learning methods in more realistic scenarios, we propose a novel evaluation benchmark called OpenPL, built from the following perspectives: 1) We reconstruct multiple scenarios of open environments, encompassing dynamic class changes, dynamic distribution shifts, and dynamic co-evolution of both distribution and classes; 2) We propose a series of new performance metrics for prompt learning methods based on the Dynamic Robustness Curve (DRC) to better understand their robustness in open environments; 3) We re-implement diverse prompt learning methods and evaluate their performance on the proposed OpenPL benchmark. The results show that no current prompt learning method is robust to open environments, and none achieves a meaningful performance improvement over zero-shot performance; designing robust prompt learning methods remains a difficult task. All re-implementations are available at https://anonymous.4open.science/r/OpenPL-565E.

Supplementary Material: zip

Primary Area: datasets and benchmarks

Submission Number: 13713