From Models to Systems: A Survey of Explainability for Tool-Augmented Language Models and AI Agents

ACL ARR 2026 January Submission6504 Authors

05 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Augmented Language Models, AI Agents, Explainability, Tool-using LLMs, Interpretability, Transparency, Reasoning
Abstract: Large language models (LLMs) are increasingly used as components of complex agentic systems that orchestrate external tools, such as retrieval mechanisms or code interpreters. In this survey, we argue that this development necessitates a rethinking of the goals of explainable artificial intelligence (XAI): rather than focusing on explanations for monolithic machine learning models, we need system-level explanations that also convey which tools are used, how they are used, and how external execution traces causally influence system behavior. We provide an overview of existing XAI methods and discuss the limitations of monolithic approaches in agentic contexts. Finally, we highlight open challenges in providing faithful explanations for LLM-based systems.
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: interpretability, explanation faithfulness, feature attribution, free-text/natural language explanations, LLM agents, tool use
Contribution Types: Model analysis & interpretability, Position papers, Surveys
Languages Studied: Not applicable
Submission Number: 6504