M³HF: Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Designing effective reward functions in multi-agent reinforcement learning (MARL) is a significant challenge, often leading to suboptimal or misaligned behaviors in complex, coordinated environments. We introduce Multi-agent Reinforcement Learning from Multi-phase Human Feedback of Mixed Quality ($\text{M}^3\text{HF}$), a novel framework that integrates multi-phase human feedback of mixed quality into the MARL training process. By involving humans with diverse expertise levels to provide iterative guidance, $\text{M}^3\text{HF}$ leverages both expert and non-expert feedback to continuously refine agents' policies. During training, we strategically pause agent learning for human evaluation, parse the feedback with large language models to assign it to the appropriate agents, and update reward functions through predefined templates, with adaptive weights adjusted via weight decay and performance-based tuning. Our approach enables the integration of nuanced human insights across various levels of quality, enhancing the interpretability and robustness of multi-agent cooperation. Empirical results in challenging environments demonstrate that $\text{M}^3\text{HF}$ significantly outperforms state-of-the-art methods, effectively addressing the complexities of reward design in MARL and enabling broader human participation in the training process.
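The reward-update step described in the abstract can be pictured with a minimal sketch. The names (`combined_reward`, `update_weights`) and the specific decay-and-normalize scheme below are illustrative assumptions rather than the paper's exact formulation: each phase of parsed feedback contributes a templated reward term whose weight decays over time and is nudged by observed performance changes.

```python
def combined_reward(base_reward, feedback_rewards, weights):
    """Total reward: environment reward plus weighted feedback-derived terms.

    feedback_rewards: one templated reward term per feedback phase.
    weights: the adaptive weight attached to each term.
    """
    return base_reward + sum(w * r for w, r in zip(weights, feedback_rewards))

def update_weights(weights, performance_deltas, decay=0.99, lr=0.1):
    """Hypothetical adaptive-weight update: decay stale feedback terms,
    then boost or dampen each term by how much team performance changed
    after it was introduced (performance_deltas)."""
    weights = [decay * w + lr * d for w, d in zip(weights, performance_deltas)]
    total = sum(abs(w) for w in weights) or 1.0
    return [w / total for w in weights]  # normalize to keep rewards on a fixed scale
```

The decay keeps early, possibly low-quality feedback from dominating later phases, while the performance-based term lets useful guidance persist; the normalization is one simple way to stop the shaped reward from drifting in scale.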
Lay Summary: Coordinating multiple AI agents on complex tasks is hard without clear guidance, leading to slow learning and poor teamwork. We leverage human language feedback: a large language model converts simple comments into reward signals, and past experiences are relabeled with these language-driven rewards. This accelerates training and boosts success rates in teamwork benchmarks by focusing agents on helpful behaviors. Our method makes multi-agent learning faster, more reliable, and easier to interpret, paving the way for transparent, efficient AI teams.
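The relabeling step mentioned in the lay summary can be sketched as a pass over a replay buffer. This is an assumption-laden illustration: the dict-based transition layout and the name `reward_fn` (the latest language-derived reward function) are hypothetical, not taken from the paper.

```python
def relabel_replay_buffer(buffer, reward_fn):
    """Recompute stored rewards under the latest language-derived reward
    function so past experience reflects current human guidance."""
    for transition in buffer:  # assumed layout: {"state", "action", "next_state", "reward"}
        transition["reward"] = reward_fn(
            transition["state"], transition["action"], transition["next_state"]
        )
    return buffer
```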
Primary Area: Reinforcement Learning->Multi-agent
Keywords: Multi-agent Reinforcement Learning, Human Feedback
Flagged For Ethics Review: true
Submission Number: 1392