Encoding Goals as Graphs: Structured Objectives for Scalable Cooperative Multi-Agent Reinforcement Learning

Published: 19 Dec 2025, Last Modified: 05 Jan 2026
Venue: AAMAS 2026 Extended Abstract
License: CC BY 4.0
Keywords: MARL, Goal-conditioned MARL, Goal Embeddings, Graph Embeddings, Contrastive Representation Learning
Abstract: Many cooperative multi-agent tasks are naturally defined by graph-structured objectives, where agents must collectively reach, for example, a desired relational configuration or satisfy a set of constraints. These objectives often encode spatial arrangements, inter-agent relations, or constraints that can be formalized as target graphs. However, current goal-conditioned multi-agent reinforcement learning (MARL) algorithms do not employ these symbolic and structured representations to direct their agents towards effective strategies. We propose Graph Embeddings for Multi-Agent coordination (GEMA), which couples any cooperative learner with a State–Goal Graph Encoder (SGE). The SGE is pre-trained in a contrastive manner to embed state graphs in a common metric space. At run time, each agent builds the state graph, queries the SGE, and computes a scalar distance to the goal embedding. The calculated distance can then be used as an intrinsic reward signal for the agents and incorporated into each agent's observation, providing feedback on the task progress. Experiments on challenging benchmarks for centralized training with decentralized execution, including cooperative navigation, load balancing, and the StarCraft Multi-Agent Challenge (v2), demonstrate that GEMA accelerates convergence and improves team returns, outperforming both standard and objective-driven state-of-the-art MARL baselines.
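The abstract's core mechanism, computing a scalar distance between state and goal graph embeddings and using its negation as an intrinsic reward, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the placeholder `embed_graph` (a mean-pooled linear projection) stands in for the contrastively pre-trained State–Goal Graph Encoder, and all names and dimensions here are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for the pre-trained State-Goal Graph Encoder (SGE).
# The paper's SGE is contrastively pre-trained to embed state graphs in a
# common metric space; this sketch only mimics the interface.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))  # node-feature dim 4 -> embedding dim 8


def embed_graph(node_features: np.ndarray) -> np.ndarray:
    """Embed a graph by mean-pooling projected node features (placeholder)."""
    return (node_features @ W).mean(axis=0)


def intrinsic_reward(state_nodes: np.ndarray, goal_nodes: np.ndarray):
    """Distance between state and goal embeddings; negated as a reward."""
    d = float(np.linalg.norm(embed_graph(state_nodes) - embed_graph(goal_nodes)))
    return -d, d


state = rng.standard_normal((5, 4))  # e.g. 5 agents/nodes, 4 features each
goal = rng.standard_normal((5, 4))   # target relational configuration
reward, dist = intrinsic_reward(state, goal)
# per the abstract, `dist` can also be appended to each agent's observation
# as a feedback signal on task progress
```

As the state graph approaches the goal graph in embedding space, the distance shrinks and the intrinsic reward rises toward zero, giving agents a dense progress signal on top of the environment return.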
Area: Representation and Reasoning (RR)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 1265