Aligning Characteristic Descriptors with Images for Human-Expert-like Explainability

Published: 10 Oct 2024, Last Modified: 03 Dec 2024, IAI Workshop @ NeurIPS 2024, CC BY 4.0
Keywords: Explainability, Vision Language Models, Face Recognition, X-ray Diagnosis
TL;DR: A framework that enables deep learning models to provide human-expert-like explanations
Abstract: In mission-critical domains such as law enforcement and medical diagnosis, the ability to explain and interpret the outputs of deep learning models is crucial for ensuring user trust and supporting informed decision-making. Despite advancements in explainability, existing methods often fall short of providing explanations that mirror the depth and clarity of those given by human experts. Such expert-level explanations are essential for the dependable application of deep learning models in law enforcement and medical contexts. Additionally, we recognize that most explanations in real-world scenarios are communicated primarily through natural language. Addressing these needs, we propose a novel approach that uses characteristic descriptors to explain model decisions by identifying their presence in images, thereby generating expert-like explanations. Our method incorporates a concept bottleneck layer within the model architecture, which calculates the similarity between image and descriptor encodings to deliver inherent and faithful explanations. Through experiments in face recognition and chest X-ray diagnosis, we show that our approach contrasts sharply with existing techniques, which are often limited to saliency maps. Our approach represents a significant step toward making deep learning systems more accountable, transparent, and trustworthy in the critical domains of face recognition and medical diagnosis.
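To make the described mechanism concrete, below is a minimal sketch of a concept bottleneck layer that scores an image embedding against natural-language characteristic descriptors and classifies from those scores. The class name `DescriptorBottleneck`, the tensor shapes, and the assumption that descriptor text embeddings come from a CLIP-style encoder are illustrative assumptions, not details stated in the abstract; the authors' actual architecture may differ.

```python
# Hypothetical sketch: concept bottleneck over characteristic descriptors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DescriptorBottleneck(nn.Module):
    """Maps an image embedding to per-descriptor similarity scores,
    then classifies from those scores so every logit is traceable to
    human-readable descriptors."""

    def __init__(self, descriptor_embeddings: torch.Tensor, num_classes: int):
        super().__init__()
        # Frozen text embeddings of the characteristic descriptors,
        # e.g. produced by a CLIP text encoder (assumed here).
        self.register_buffer(
            "descriptors", F.normalize(descriptor_embeddings, dim=-1)
        )
        self.classifier = nn.Linear(descriptor_embeddings.shape[0], num_classes)

    def forward(self, image_embedding: torch.Tensor):
        # Cosine similarity between the image and every descriptor.
        image_embedding = F.normalize(image_embedding, dim=-1)
        concept_scores = image_embedding @ self.descriptors.T  # (B, num_descriptors)
        logits = self.classifier(concept_scores)
        # concept_scores double as the explanation: high values indicate
        # which descriptors the model judged to be present in the image.
        return logits, concept_scores


# Toy usage with random tensors standing in for encoder outputs.
if __name__ == "__main__":
    embed_dim, num_descriptors, num_classes = 512, 8, 2
    descriptors = torch.randn(num_descriptors, embed_dim)
    model = DescriptorBottleneck(descriptors, num_classes)
    image_emb = torch.randn(4, embed_dim)  # batch of 4 image embeddings
    logits, scores = model(image_emb)
    print(logits.shape, scores.shape)  # torch.Size([4, 2]) torch.Size([4, 8])
```

Because the classifier sees only descriptor similarities, the explanation is inherent to the prediction rather than generated post hoc, which is the contrast with saliency-map methods drawn in the abstract.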
Track: Main track
Submitted Paper: No
Published Paper: No
Submission Number: 61