Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour
Keywords: Multi-Agent Systems, Explainable AI, Causality, Large Language Models, Autonomous Driving
TL;DR: The paper shows that we can accurately and actionably explain actions in multi-agent systems by combining an LLM with a simulator to explore interventions on agents' actions in counterfactual worlds.
Abstract: Autonomous multi-agent systems (MAS) are useful for automating complex tasks but raise trust concerns due to risks such as miscoordination or goal misalignment. Explainability is vital for users' trust calibration, but explainable MAS face challenges due to complex environments, the human factor, and non-standardised evaluation. Leveraging the counterfactual effect size model and LLMs, we propose _**A**gentic e**X**planations via **I**nterrogative **S**imulation (AXIS)_. AXIS generates human-centred action explanations for multi-agent policies by having an LLM interrogate an environment simulator with queries such as '_whatif_' and '_remove_', observing and synthesising counterfactual information over multiple rounds. We evaluate AXIS on autonomous driving across ten scenarios for five LLMs, using a comprehensive methodology that combines robustness, subjective preference, correctness, and goal/action prediction, with an external LLM as evaluator. Compared to baselines, AXIS improves perceived explanation correctness by at least 7.7% across all models and goal prediction accuracy by 23% for four models, with comparable action prediction accuracy, achieving the highest scores overall.
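A minimal sketch of the interrogative-simulation loop described in the abstract, in Python. This is not the paper's actual implementation; the `llm` and `simulate` callables, the query format, and the round budget are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List


def axis_explain(
    llm: Callable[[str], str],          # assumed interface: prompt -> text completion
    simulate: Callable[[Dict], Dict],   # assumed interface: intervention -> rollout summary
    scenario: str,
    action: str,
    rounds: int = 5,
) -> str:
    """Explain `action` in `scenario` by interrogating a simulator with counterfactuals."""
    evidence: List[str] = []
    for _ in range(rounds):
        # Ask the LLM which counterfactual intervention to try next,
        # e.g. "whatif agent_2 yields" or "remove agent_3".
        query = llm(
            f"Scenario: {scenario}\nAction to explain: {action}\n"
            f"Evidence so far: {evidence}\n"
            "Propose one intervention as 'whatif <agent> <action>' or "
            "'remove <agent>', or reply 'done' if you have enough evidence."
        ).strip()
        if query.lower() == "done":
            break
        # Roll out the counterfactual world and record how the outcome changed.
        outcome = simulate({"scenario": scenario, "intervention": query})
        evidence.append(f"{query} -> {outcome}")
    # Synthesise a human-centred explanation from the gathered counterfactual evidence.
    return llm(
        f"Explain why the agent chose '{action}' in '{scenario}', "
        f"citing this counterfactual evidence: {evidence}"
    )
```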
Area: Representation and Reasoning (RR)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 501