Keywords: Zero-shot cooperation, Cooperative RL, Human-AI Teaming
Abstract: The long-standing research challenge of Zero-shot Cooperation (ZSC) has been tackled by using cooperative reinforcement learning to train agents that optimize the environment reward function, and by evaluating them through task performance metrics such as task reward. However, such evaluation focuses only on task completion and is agnostic to how the two agents work with each other. Specifically, we are interested in understanding the cooperative behaviors arising within a team, a problem that has been overlooked by the existing multi-agent reinforcement learning (MARL) literature. To formally address this problem, we propose the concept of constructive interdependence, which measures how much agents rely on each other's actions to achieve the shared goal, as a key metric for evaluating cooperation in teams. We interpret interdependence in terms of action interactions in a STRIPS formalism, and define metrics that allow us to assess the degree of reliance between the agents' actions. We pair state-of-the-art ZSC agents with other agents in the popular Overcooked domain, and evaluate the task reward and teaming performance of such teams. Our results demonstrate that although trained agents attain high task rewards, they fail to induce cooperative behavior, showing very low levels of interdependence across teams. Furthermore, our analysis reveals that teaming performance is not necessarily correlated with task reward, highlighting that task reward alone cannot reliably measure the cooperation arising in a team.
Submission Number: 19