Keywords: Explainable AI, LLM
Abstract: Explainable AI (XAI) is essential for helping users interpret model behavior and proactively identify potential faults. Agentic XAI systems that integrate Large Language Models (LLMs) have emerged to make explanations more accessible for non-expert users through natural language.
A critical limitation of existing systems is that they often generate plausible but unfaithful explanations.
This is problematic because many XAI methods are unfaithful for complex models, and LLMs can amplify this incorrect information, ultimately misleading users.
To address this limitation, we propose Faithful Agentic XAI (FAX), a framework that actively enhances explanation faithfulness.
FAX introduces a systematic verification process where an LLM agent cross-checks claims against inherently faithful tools. This process filters out unreliable or contradictory claims and leads to more faithful explanations.
We also propose CRAFTER-XAI-Bench, a benchmark framework built on an open-world reinforcement learning environment. The benchmark features complex models with diverse goals and challenging test scenarios, enabling a rigorous assessment of explanation faithfulness under realistic conditions.
Experiments demonstrate that FAX significantly improves the faithfulness of explanations, marking a crucial step towards faithful and trustworthy Agentic XAI.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: explanation faithfulness, LLM/AI agents
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2826