Encoding Goals as Graphs: Structured Objectives for Scalable Cooperative Multi-Agent Reinforcement Learning

Published: 23 Jun 2025, Last Modified: 25 Jun 2025 | CoCoMARL 2025 Poster | CC BY 4.0
Keywords: MARL, Goal-conditioned MARL, Goal Embeddings, Graph Embeddings, Contrastive Representation Learning
Abstract: Many cooperative multi-agent tasks are naturally defined by graph-structured objectives, in which agents must collectively reach a desired relational configuration or satisfy a set of constraints. These objectives often encode spatial arrangements, inter-agent relations, or constraints that can be formalized as target graphs. However, current goal-conditioned multi-agent reinforcement learning (MARL) algorithms do not exploit such symbolic, structured representations to guide their agents towards effective strategies. We propose Graph Embeddings for Multi-Agent coordination (GEMA), which couples any cooperative learner with a State–Goal Graph Encoder (SGE). The SGE is contrastively pre-trained to embed state and goal graphs in a shared metric space. At run time, each agent builds its current state graph, queries the SGE, and computes a scalar distance to the broadcast goal embedding. This distance is appended to the agent's observation and converted into an intrinsic reward, giving the agent a continuous measure of progress towards the goal. Experiments on two benchmarks show that GEMA accelerates convergence and improves team returns, outperforming strong MARL baselines across all scenarios.
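
The run-time loop described in the abstract (embed the state graph, measure the distance to the broadcast goal embedding, augment the observation, and derive an intrinsic reward) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the encoder below is a fixed random projection standing in for the contrastively pre-trained SGE, and the names `encode_graph` and `gema_step`, as well as the progress-based reward shaping, are illustrative assumptions.

```python
import numpy as np

def encode_graph(graph_features: np.ndarray) -> np.ndarray:
    """Stand-in for the pre-trained State-Goal Graph Encoder (SGE).

    A fixed random projection plays the role of the learned encoder here;
    in GEMA the encoder is contrastively pre-trained on graph pairs.
    """
    rng = np.random.default_rng(0)                       # fixed "weights" for the sketch
    projection = rng.standard_normal((graph_features.size, 16))
    embedding = graph_features.reshape(-1) @ projection
    return embedding / (np.linalg.norm(embedding) + 1e-8)  # unit-norm embedding

def gema_step(obs: np.ndarray,
              state_graph: np.ndarray,
              goal_embedding: np.ndarray,
              prev_distance):
    """Augment one agent's observation and compute an intrinsic reward.

    The agent embeds its current state graph, computes the Euclidean distance
    to the broadcast goal embedding, appends that scalar to its observation,
    and is rewarded for reducing the distance between consecutive steps (one
    plausible shaping choice; the abstract only states that the distance is
    converted into an intrinsic reward signal).
    """
    state_embedding = encode_graph(state_graph)
    distance = float(np.linalg.norm(state_embedding - goal_embedding))
    augmented_obs = np.concatenate([obs, [distance]])
    intrinsic_reward = 0.0 if prev_distance is None else prev_distance - distance
    return augmented_obs, intrinsic_reward, distance

# Usage: embed the target graph once, then call gema_step every environment step.
goal_emb = encode_graph(np.ones((4, 4)))                         # hypothetical goal graph
obs, r_int, d = gema_step(np.zeros(8), np.eye(4), goal_emb, prev_distance=None)
```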
Submission Number: 15