Diverse Conventions for Human-AI Collaboration

Published: 21 Sept 2023 · Last Modified: 02 Nov 2023 · NeurIPS 2023 poster
Keywords: Multi-Agent RL, Multi-Agent Coordination, Human-AI Coordination
TL;DR: We present CoMeDi, a technique for generating a diverse set of good-faith conventions in multi-agent coordination games, and demonstrate its utility for human-AI collaboration.
Abstract: Conventions are crucial for strong performance in cooperative multi-agent games because they allow players to coordinate on a shared strategy without explicit communication. Unfortunately, standard multi-agent reinforcement learning techniques, such as self-play, converge to conventions that are arbitrary and non-diverse, leading to poor generalization when interacting with new partners. In this work, we present a technique for generating diverse conventions by (1) maximizing their rewards during self-play, while (2) minimizing their rewards when playing with previously discovered conventions (cross-play), encouraging conventions to be semantically different. To ensure that learned policies act in good faith despite the adversarial optimization of cross-play, we introduce mixed-play, where an initial state is randomly generated by sampling self-play and cross-play transitions, and the player learns to maximize the self-play reward from this initial state. We analyze the benefits of our technique on various multi-agent collaborative games, including Overcooked, and find that our technique can adapt to the conventions of humans, surpassing human-level performance when paired with real users.
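The objective described in the abstract can be made concrete with a short sketch. This is a minimal illustration under assumptions, not the paper's implementation: `self_play_return`, `cross_play_return`, and `mixed_play_return` are hypothetical rollout helpers, `xp_weight` and `mp_weight` are assumed coefficients, and penalizing only the most-compatible previous convention (a max) is one plausible reading of "minimizing their rewards when playing with previously discovered conventions."

```python
# Minimal sketch of a diverse-conventions objective (assumed form, not the
# paper's released code). Each helper returns an expected episode return.
from typing import Callable, List

Policy = Callable  # placeholder type for a trainable policy


def comedi_objective(
    pi: Policy,
    previous: List[Policy],
    self_play_return: Callable[[Policy, Policy], float],   # hypothetical helper
    cross_play_return: Callable[[Policy, Policy], float],  # hypothetical helper
    mixed_play_return: Callable[[Policy, Policy], float],  # hypothetical helper
    xp_weight: float = 1.0,  # assumed hyperparameter
    mp_weight: float = 1.0,  # assumed hyperparameter
) -> float:
    """Scalar objective for training the next convention pi.

    (1) Maximize self-play reward.
    (2) Minimize cross-play reward against previously discovered conventions,
        so the new convention is semantically different from earlier ones.
    (3) Mixed-play: maximize the self-play return achieved from start states
        sampled along mixed self-play/cross-play rollouts, which discourages
        bad-faith play induced by the adversarial cross-play term.
    """
    sp = self_play_return(pi, pi)
    # Penalize compatibility with earlier conventions; using the max targets
    # only the most-compatible previous convention (an assumption here).
    xp = max((cross_play_return(pi, prev) for prev in previous), default=0.0)
    # Average the mixed-play term across previous conventions (an assumption).
    mp = (
        sum(mixed_play_return(pi, prev) for prev in previous) / len(previous)
        if previous
        else 0.0
    )
    return sp - xp_weight * xp + mp_weight * mp
```

In a training loop, one would fit conventions sequentially, maximizing this objective for each new policy against the list of conventions discovered so far.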
Supplementary Material: zip
Submission Number: 9068