Keywords: Multi-agent System, Shared Memory, Transformer
Abstract: Coordination in decentralized multi-agent reinforcement learning (MARL) requires that agents share information about their behavior and intentions. Existing approaches rely either on communication protocols subject to domain or resource constraints, or on centralized training that scales poorly to large agent populations. We introduce the Shared Recurrent Memory Transformer (SRMT), which enables coordination through unconstrained communication. SRMT provides a global memory workspace where agents broadcast their learned working-memory states and query others' memory representations to exchange information and coordinate, while maintaining decentralized training and execution. We evaluate SRMT on the Partially Observable Multi-Agent Pathfinding (PO-MAPF) problem, where coordination is vital for optimal path planning and deadlock avoidance. We demonstrate that shared memory enables emergent coordination even when the reward function provides minimal or no guidance. On a purpose-built Bottleneck task that requires negotiation, SRMT consistently outperforms communicative and memory-augmented baselines, particularly under sparse reward signals, and successfully generalizes to longer corridors unseen during training. On POGEMA maps, SRMT scales with increasing agent population and map size, achieving performance competitive with recent MARL, hybrid, and planning-based methods while requiring no domain-specific heuristics. These results demonstrate that a transformer with shared recurrent memory enhances coordination in decentralized multi-agent systems.
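For intuition, the sketch below shows one shared-memory coordination step of the kind the abstract describes: each agent keeps a recurrent memory vector, all vectors are pooled into a shared workspace, and every agent cross-attends over that workspace before updating its own memory. This is a minimal illustration in PyTorch under assumed design choices; the class and parameter names (SharedMemoryAttention, mem_update, the GRU-based update) are hypothetical and do not reproduce the paper's actual architecture.

```python
# Hypothetical sketch of a shared-memory step in the spirit of SRMT (PyTorch).
# Not the authors' implementation; names and the GRU memory update are assumptions.
import torch
import torch.nn as nn

class SharedMemoryAttention(nn.Module):
    """Each agent broadcasts its recurrent memory into a shared workspace
    and cross-attends over all agents' memories to update its own state."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mem_update = nn.GRUCell(d_model, d_model)  # assumed update rule

    def forward(self, obs_emb: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # obs_emb: (n_agents, d_model) per-agent observation embeddings
        # memory:  (n_agents, d_model) per-agent recurrent memory states
        n_agents, d = memory.shape
        # Shared workspace: every agent sees the broadcast memories of all agents.
        shared = memory.unsqueeze(0).expand(n_agents, n_agents, d)
        query = obs_emb.unsqueeze(1)                       # (n_agents, 1, d)
        attended, _ = self.cross_attn(query, shared, shared)
        # Each agent updates its personal memory from what it read.
        return self.mem_update(attended.squeeze(1), memory)

# Toy usage: 8 agents, 64-dim embeddings; memory persists across timesteps.
core = SharedMemoryAttention(d_model=64)
obs = torch.randn(8, 64)
mem = torch.zeros(8, 64)
mem = core(obs, mem)  # one coordination step; mem feeds the next step
```

Because agents only read from and write to the shared memory, training and execution remain decentralized: no agent needs another agent's parameters or gradients, only the broadcast memory vectors.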
Supplementary Material: zip
Primary Area: applications to robotics, autonomy, planning
Submission Number: 24875