Keywords: Explainable Reinforcement Learning, Swarm Intelligence, Multi-Agent Reinforcement Learning
TL;DR: This paper introduces an explanation framework for the emergence of collective behaviors in multi-agent reinforcement learning.
Abstract: Recent studies have shown that multiple agents trained through reinforcement
learning can surprisingly exhibit swarm behaviors from simple rewards, without any
rewards specifically encouraging aggregation. Explaining how complex collective
behavior emerges from these simple rewards is an intriguing research problem,
but the underlying process has so far remained a black box. This paper aims to
reveal the hidden rules in this process. Specifically, we find that agents
are able to develop complex behaviors from simple rewards because they
implicitly learn the geometric fields of the environment and utilize these structures
as desired targets for coordinated movement. This finding is supported by two
distinct tasks: a competitive predator-prey pursuit and a cooperative multi-robot
shape assembly. 1) In the competitive environment, prey agents surprisingly converge
toward the boundary of the predators’ Voronoi diagram, demonstrating that they
are able to spontaneously learn Voronoi diagrams without any guided rewards. To
gain the above insights, we propose a two-stage EEC (Ego-observation → Ego-behavior
→ Collective-behavior) explanatory framework. This includes a novel
analytical tool called the Agent Response Map (ARM), which reveals agents’
decision-making patterns across space and identifies regions of aggregation and
avoidance. 2) We then extend the proposed method to a more realistic
and challenging cooperative robot-swarm task, shape assembly, to validate its
generality and practical utility. The insights and tools presented in this paper
may provide a new perspective on the connection between AI-driven multi-agent
systems and real-world biological systems.
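As a rough illustration of the Voronoi observation in the abstract, the Python sketch below measures how close a position lies to the boundary of the predators' Voronoi diagram. It is not the authors' method or their Agent Response Map: the function name `voronoi_boundary_gap`, the coordinates, and the choice of metric (distance to the second-nearest predator minus distance to the nearest) are assumptions made for illustration. The gap is zero exactly on a Voronoi edge, so small values indicate positions near the boundary.

```python
import numpy as np


def voronoi_boundary_gap(points, predators):
    """Hypothetical helper: return d2 - d1 for each point.

    d1 and d2 are the distances from the point to its nearest and
    second-nearest predator. A point lies on an edge of the predators'
    Voronoi diagram exactly when the two are equal, i.e. the gap is 0.
    """
    points = np.atleast_2d(np.asarray(points, dtype=float))
    predators = np.asarray(predators, dtype=float)
    # Pairwise Euclidean distances, shape (n_points, n_predators).
    dists = np.linalg.norm(points[:, None, :] - predators[None, :, :], axis=-1)
    # Keep the two smallest distances for each point.
    two_nearest = np.sort(dists, axis=1)[:, :2]
    return two_nearest[:, 1] - two_nearest[:, 0]


# Illustrative positions only; not taken from the paper.
predators = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])
prey = np.array([[1.0, 0.0],   # equidistant from the first two predators
                 [0.1, 0.0]])  # deep inside the first predator's cell
gaps = voronoi_boundary_gap(prey, predators)
print(gaps)
```

Averaging such a gap over prey positions across an episode would give one simple way to check whether prey drift toward the Voronoi boundary, under the assumption that Euclidean distance in the plane is the relevant geometry.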
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 24646