Keywords: Explainable AI, LLM
Abstract: Explainable AI (XAI) is essential for helping users interpret model behavior and proactively identify potential faults. Agentic XAI systems that integrate Large Language Models (LLMs) have emerged to make explanations more accessible for non-expert users through natural language.
A critical limitation of existing systems is that they often generate plausible but unfaithful explanations.
This is problematic because many XAI methods are unfaithful for complex models, and LLMs can amplify this incorrect information, ultimately misleading users.
To address this limitation, we propose Faithful Agentic XAI (FAX), a framework that actively enhances explanation faithfulness.
FAX introduces a systematic verification process where an LLM agent cross-checks claims against inherently faithful tools. This process filters out unreliable or contradictory claims and leads to more faithful explanations.
We also propose CRAFTER-XAI-Bench, a benchmark framework built on an open-world reinforcement learning environment. The benchmark features complex models with diverse goals and challenging test scenarios, enabling a rigorous assessment of explanation faithfulness under realistic conditions.
Experiments demonstrate that FAX significantly improves the faithfulness of explanations, marking a crucial step towards faithful and trustworthy Agentic XAI.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: explanation faithfulness, LLM/AI agents
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2826