Enhancing Multi-Agent Robustness: Addressing the Off-Diagonal Problem with Population-Based Training

Published: 01 Jan 2024, Last Modified: 13 Jul 2025. Venue: ICTAI 2024. License: CC BY-SA 4.0
Abstract: Cooperating with previously unseen agents in multi-agent cooperative scenarios is a challenging problem. Agents trained together tend to rely on conventions discovered during training, and these conventions may be absent when the agents must cooperate with new partners. The result is unsatisfactory performance in such pairings. We call this the off-diagonal problem. In this paper, we investigate the problem on more than 20 maps from the Overcooked environment. First, we propose and evaluate a number of metrics that quantify the off-diagonal problem; then we propose a population-based training technique to alleviate it. The results show that the proposed metrics can be used to divide the maps into groups according to how strongly they are affected by the problem, and that population-based training improves agent performance especially on the maps where the problem is present, without negative consequences on the other maps.
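
The abstract does not spell out the proposed metrics, so the following is only an illustrative sketch of how the off-diagonal problem could be measured, not the paper's own definition. It assumes a square cross-play matrix in which entry (i, j) is the score obtained when agent i is paired with agent j, so that diagonal entries correspond to partners trained together. The function name `diagonal_gap` and the scores are hypothetical.

```python
from statistics import mean


def diagonal_gap(cross_play):
    """Mean diagonal score minus mean off-diagonal score.

    Assumption (not taken from the paper): cross_play[i][j] is the score
    of agent i paired with agent j, and the diagonal holds the pairings
    of agents that were trained together.
    """
    n = len(cross_play)
    if n < 2 or any(len(row) != n for row in cross_play):
        raise ValueError("expected a square matrix with at least two agents")
    diagonal = [cross_play[i][i] for i in range(n)]
    off_diagonal = [
        cross_play[i][j] for i in range(n) for j in range(n) if i != j
    ]
    return mean(diagonal) - mean(off_diagonal)


# Made-up scores for three agents: strong with training partners,
# weak with unseen partners.
scores = [
    [100, 20, 30],
    [25, 90, 10],
    [15, 35, 110],
]
gap = diagonal_gap(scores)  # diagonal mean 100, off-diagonal mean 22.5
```

Under this assumed measure, a large positive gap would indicate that agents depend on conventions shared only with their training partners, while a gap near zero would suggest that a map is largely unaffected.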