Language Grounded Multi-agent Reinforcement Learning with Human-interpretable Communication

Huao Li; Hossein Nourkhiz Mahjoub; Behdad Chalaki; Vaishnav Tadiparthi; Kwonjoon Lee; Ehsan Moradi Pari; Charles Michael Lewis; Katia P. Sycara

Published: 25 Sept 2024, Last Modified: 06 Nov 2024

Venue: NeurIPS 2024 poster

License: CC BY-NC-ND 4.0

Keywords: Multi-Agent Reinforcement Learning, Emergent Communication, Ad-hoc Teamwork, Large Language Models

TL;DR: We propose a novel computational pipeline to ground MARL communication in human language using embodied LLM agents, enabling interpretable and generalizable communication in ad-hoc multi-agent teamwork.

Abstract: Multi-Agent Reinforcement Learning (MARL) methods have shown promise in enabling agents to learn a shared communication protocol from scratch and accomplish challenging team tasks. However, the learned language is usually not interpretable to humans or other agents not co-trained together, limiting its applicability in ad-hoc teamwork scenarios. In this work, we propose a novel computational pipeline that aligns the communication space between MARL agents with an embedding space of human natural language by grounding agent communications on synthetic data generated by embodied Large Language Models (LLMs) in interactive teamwork scenarios. Our results demonstrate that introducing language grounding not only maintains task performance but also accelerates the emergence of communication. Furthermore, the learned communication protocols exhibit zero-shot generalization capabilities in ad-hoc teamwork scenarios with unseen teammates and novel task states. This work presents a significant step toward enabling effective communication and collaboration between artificial agents and humans in real-world teamwork settings.
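As a rough illustration of what aligning a communication space with a language embedding space could look like, the sketch below adds an alignment penalty to a task loss and decodes a communication vector to its nearest reference message. The abstract does not specify the form of the objective, so the cosine-based penalty, the weighting term, and all names here (`grounding_loss`, `total_loss`, `nearest_message`, `weight`) are assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def grounding_loss(comm_vectors, reference_embeddings):
    """Mean (1 - cosine similarity) over paired vectors.

    Assumption: each agent communication vector is paired with the embedding
    of a reference natural-language message for the same situation.
    """
    if not comm_vectors or len(comm_vectors) != len(reference_embeddings):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        1.0 - cosine_similarity(c, e)
        for c, e in zip(comm_vectors, reference_embeddings)
    )
    return total / len(comm_vectors)


def total_loss(rl_loss, comm_vectors, reference_embeddings, weight=1.0):
    """Task loss plus a weighted alignment penalty (hypothetical combination)."""
    return rl_loss + weight * grounding_loss(comm_vectors, reference_embeddings)


def nearest_message(comm_vector, vocabulary):
    """Return the message whose embedding is most similar to comm_vector.

    vocabulary maps a message string to its embedding (toy values in the
    usage example, not real language-model embeddings).
    """
    return max(
        vocabulary,
        key=lambda msg: cosine_similarity(comm_vector, vocabulary[msg]),
    )


if __name__ == "__main__":
    vocab = {"go left": [1.0, 0.0], "go right": [0.0, 1.0]}
    print(nearest_message([0.9, 0.1], vocab))
    print(total_loss(2.0, [[1.0, 0.0]], [[0.0, 1.0]], weight=0.5))
```

Under this assumed setup, a communication vector that lies close to a reference embedding can be read back as the corresponding message, which is one simple way an aligned communication space could be made interpretable to teammates that were not co-trained.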

Supplementary Material: zip

Primary Area: Human-AI interaction

Submission Number: 11830
