Explaining Contrastive Models using Exemplars: Explanation, Confidence, and Knowledge Limits

23 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Explainable AI, Contrastive Learning, Exemplars, Confidence, Knowledge Limits, OOD
TL;DR: Explaining Contrastive Models using Exemplars
Abstract: Explainable AI (XAI) provides human users with transparency and interpretability for powerful ``black-box'' models. Recent work on XAI has focused on explaining specific model responses by identifying key input features using attribution analysis. Another avenue for explaining AI decisions is to leverage exemplars of training data. However, there has been limited investigation of using exemplars to establish metrics for confidence and knowledge limits. Recently, contrastive learning has received increased attention in computer vision, natural language processing, audio, and many other fields, yet very few explainability studies leverage the contrastive learning process to explain such models. In this paper, we advance post-hoc explainable AI for contrastive models. The main contributions are i) explaining the relation between test and training data samples using pairwise attribution analysis, ii) developing exemplar-based confidence metrics, and iii) establishing measures for the model's knowledge limits. We evaluate the proposed techniques using the OpenAI CLIP model. The evaluation on ImageNet demonstrates that exemplars of training data can provide meaningful explanations for the decision-making of contrastive models. We observe that the proposed exemplar-based confidence score gives a more reliable, dataset-agnostic probability measure than the softmax score and temperature scaling. Furthermore, the OOD detection module of our framework improves significantly on other state-of-the-art methods (6.1\% and 9.6\% improvement in AUROC and FPR@95TPR, respectively). Together, the three modules give a meaningful explanation of the decisions made by a contrastive model. The proposed techniques extend the body of work on XAI for contrastive models and are expected to impact the explainability of future foundation models.
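The abstract does not spell out the exact form of the exemplar-based confidence score or the knowledge-limit test. As a rough, non-authoritative illustration, the sketch below shows one way such a score and an OOD check could be built from precomputed CLIP image embeddings of training exemplars; all function names, variable names, and thresholds (`exemplar_confidence`, `train_embeds`, `top_k`, `ood_threshold`) are hypothetical and not taken from the paper.

```python
# Minimal sketch (not the authors' implementation): confidence from cosine
# similarity between a test embedding and stored training-exemplar embeddings,
# plus a simple knowledge-limit (OOD) check. Assumes embeddings are already
# computed, e.g. with CLIP's image encoder, and L2-normalizable.
import numpy as np

def exemplar_confidence(test_embed, train_embeds, train_labels, top_k=20):
    """Return (predicted label, confidence, nearest exemplar indices).

    Confidence is the mean cosine similarity to the top-k nearest training
    exemplars that share the predicted (majority-vote) label; the nearest
    exemplars themselves serve as the explanation.
    """
    t = test_embed / np.linalg.norm(test_embed)
    X = train_embeds / np.linalg.norm(train_embeds, axis=1, keepdims=True)
    sims = X @ t                                # cosine similarity to every exemplar
    nearest = np.argsort(-sims)[:top_k]         # indices of the k closest exemplars
    pred = np.bincount(train_labels[nearest]).argmax()  # majority-vote label
    class_sims = sims[nearest][train_labels[nearest] == pred]
    confidence = float(class_sims.mean())       # exemplar-based confidence in [-1, 1]
    return pred, confidence, nearest

def is_out_of_distribution(confidence, ood_threshold=0.25):
    """Flag an input as beyond the model's knowledge limits when its
    exemplar-based confidence falls below a validation-chosen threshold."""
    return confidence < ood_threshold
```

In this toy version, calibrating `ood_threshold` on held-out in-distribution data would determine the operating point that metrics such as AUROC and FPR@95TPR summarize.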
Primary Area: visualization or interpretation of learned representations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8039