ReMAC: Large Language Model-Driven Reward Design for Multi-Agent Manipulation Collaboration

Published: 28 Sept 2025, Last Modified: 09 Oct 2025 · SEA @ NeurIPS 2025 Poster · CC BY 4.0
Keywords: Multi-Agent Reinforcement Learning, Reward Design, Large Language Model, Multi-Agent Benchmark
Abstract: Multi-agent collaboration, such as in multi-robot systems, often relies on carefully crafted reward functions, which are crucial for learning collaborative policies. However, designing efficient reward functions for multi-agent systems remains an open challenge. To bridge this gap, we propose ReMAC, a novel large language model-driven framework for Reward generation in Multi-Agent Collaboration. ReMAC employs a hierarchical approach to generate and optimize multi-agent reward functions: the upper level maintains and iteratively optimizes a population of reward functions from both team-level and individual-agent perspectives, while the lower level applies multi-agent reinforcement learning (MARL) algorithms to derive effective collaborative policies. This hierarchical design enables efficient learning and optimization of multi-agent policies. Motivated by recent advances in robotics, especially in embodied AI, we observe that existing multi-agent benchmarks fall short in supporting collaborative manipulation tasks. To fill this gap, we design ManiCraft, a Multi-Agent Manipulation Collaboration benchmark, aiming to advance research on robotic manipulation in the MARL community. Experimental results demonstrate that ReMAC constructs high-quality reward functions that outperform even those manually designed by human experts.
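The abstract describes a bi-level loop: an upper level that evolves a population of LLM-generated reward functions and a lower level that trains collaborative policies with MARL. The following is a minimal Python sketch of that loop under stated assumptions; all names (propose_rewards, train_marl, evaluate, N_CANDIDATES, N_ITERATIONS) and the feedback format are hypothetical stand-ins, since the paper's actual prompts, MARL algorithm, and evaluation protocol are not given in this abstract.

import random

N_CANDIDATES = 4   # population size of LLM-generated reward functions (assumed)
N_ITERATIONS = 3   # outer optimization rounds (assumed)


def propose_rewards(task_description, feedback):
    """Stand-in for the upper-level LLM call: returns executable reward
    functions combining team-level and individual-agent terms."""
    def make_reward(seed):
        def reward(team_state, agent_states):
            rng = random.Random(seed)
            team_term = rng.random() * sum(team_state)                 # team-level shaping
            agent_term = sum(rng.random() * s for s in agent_states)   # per-agent shaping
            return team_term + agent_term
        return reward
    return [make_reward(i) for i in range(N_CANDIDATES)]


def train_marl(reward_fn):
    """Stand-in for the lower level: train a collaborative policy under
    reward_fn with a MARL algorithm and return the joint policy."""
    return lambda observations: [0.0 for _ in observations]  # dummy policy


def evaluate(policy):
    """Stand-in for rolling out the policy on the manipulation task;
    returns a scalar task success score."""
    return random.random()


feedback = None
best_score, best_reward = float("-inf"), None
for _ in range(N_ITERATIONS):
    candidates = propose_rewards("two arms lift a box together", feedback)
    scored = [(evaluate(train_marl(r)), r) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if scored[0][0] > best_score:
        best_score, best_reward = scored[0]
    # Summarize training outcomes so the LLM can refine the next population.
    feedback = f"best score this round: {scored[0][0]:.3f}"

print(f"best reward function score: {best_score:.3f}")

The key design point the sketch illustrates is the separation of concerns: the outer loop only sees scalar evaluation feedback, while all policy learning happens inside train_marl, so reward candidates can be compared on equal footing.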
Archival Option: The authors of this submission do *not* want it to appear in the archival proceedings.
Submission Number: 67