KEEP: Towards a Knowledge-Enhanced Explainable Prompting Framework for Vision-Language Models

14 Sept 2024 (modified: 22 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Prompt, Domain Knowledge, VLM, XAI
TL;DR: A knowledge-enhanced explainable prompting framework for vision-language models
Abstract: Large-scale vision-language models (VLMs) embedded with expansive representations and visual concepts have shown significant potential in the computer vision community. Efficiently adapting VLMs such as CLIP to downstream tasks has garnered growing attention, with prompt learning emerging as a representative approach. However, most existing prompt-based adaptation methods, which rely solely on coarse-grained textual prompts, suffer from limited performance and interpretability when handling tasks that require domain-specific knowledge. As a result, they fail to satisfy the stringent trustworthiness requirements of Explainable Artificial Intelligence (XAI) in high-risk scenarios such as healthcare. To address this issue, we propose a Knowledge-Enhanced Explainable Prompting (KEEP) framework that leverages fine-grained domain-specific knowledge to enhance the adaptation process across various domains, helping to bridge the gap between the general domain and specific domains. To the best of our knowledge, this is the first work to incorporate retrieval-augmented generation and domain-specific foundation models to provide more reliable image-wise knowledge for prompt learning in various domains, alleviating the lack of fine-grained annotations while offering both visual and textual explanations. Extensive experiments and explainability analyses conducted on eight datasets from different domains demonstrate that our method simultaneously achieves superior performance and interpretability, shedding light on the effectiveness of the collaboration between foundation models and XAI. The code will be made publicly available.
Supplementary Material: pdf
Primary Area: interpretability and explainable AI
Submission Number: 673