Towards Faithful Agentic XAI

ICLR 2026 Conference Submission17836 Authors

19 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: Explainable AI
Abstract: Explainable AI (XAI) is essential for helping users interpret model behavior and proactively identify potential faults. Recently, Agentic XAI systems that integrate Large Language Models (LLMs) have emerged to make explanations more accessible to non-expert users through natural language. However, a critical limitation of existing systems is their failure to address explanation faithfulness. This is problematic because many XAI methods are unfaithful for complex models, and LLMs can amplify such incorrect information, ultimately misleading users. To address this limitation, we propose Faithful Agentic XAI (FAX), a framework that actively enhances explanation faithfulness. FAX introduces a systematic verification process in which an LLM agent cross-checks claims against inherently faithful tools. This process filters out unreliable or contradictory evidence and yields more faithful explanations. For evaluation, we propose CRAFTER-XAI-Bench, a benchmark framework built on an open-world reinforcement learning environment. The benchmark features complex models with diverse goals and challenging test scenarios, enabling rigorous assessment of explanation faithfulness under realistic conditions. Experiments demonstrate that FAX significantly improves the faithfulness of explanations, marking a crucial step towards faithful and trustworthy Agentic XAI.
Supplementary Material: pdf
Primary Area: interpretability and explainable AI
Submission Number: 17836