Fairness and Cooperation between Independent Reinforcement Learners through Indirect Reciprocity

Published: 01 Jan 2024, Last Modified: 17 Oct 2024 · AAMAS 2024 · CC BY-SA 4.0
Abstract: In a multi-agent setting, altruistic cooperation is costly yet socially desirable. As such, agents adapting through independent reinforcement learning struggle to converge to efficient, cooperative policies. Indirect reciprocity (IR) constitutes a possible mechanism to encourage cooperation by introducing reputations, social norms, and the possibility that agents reciprocate based on past actions. IR has been studied mainly in homogeneous populations. In this paper, we introduce a model that allows for both reputation- and group-based cooperation, and analyse how specific social norms (i.e., rules to assign reputations) can lead to varying levels of cooperation and fairness. We investigate how a finite population of independent Q-learning agents performs under different social norms. We observe that while norms such as Stern-Judging sustain both cooperation and fairness in populations of learning agents, other norms used to judge in- or out-group interactions can lead to unfair outcomes.