Learning Partner Selection Rules that Sustain Cooperation in Social Dilemmas with the Option of Opting Out

Published: 01 Jan 2024, Last Modified: 15 May 2025, AAMAS 2024, CC BY-SA 4.0
Abstract: We study populations of self-interested agents playing a two-player repeated Prisoner's Dilemma in which each player may opt out of the interaction and be randomly assigned to another partner instead. The partner selection component makes these games akin to random matching, where defection is known to take over the entire population. Results in the literature have shown that cooperation can be sustained in the long run when agents are forced to obey a fixed partner selection rule known as Out-for-Tat, under which players systematically break ties with defectors. In this paper, we remove this assumption and study agents that learn both action-selection and partner-selection strategies. Using multi-agent reinforcement learning, we show that cooperation can be sustained without forcing agents to play predetermined strategies. Our simulations show that agents can learn in-game strategies such as Tit-for-Tat by themselves. Moreover, they simultaneously discover cooperation-sustaining partner selection rules, notably Out-for-Tat, as well as other new rules that make cooperation prevail.
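To make the two named rules concrete, the following is a minimal sketch of Tit-for-Tat (action selection) and Out-for-Tat (partner selection) as hand-coded functions. It is illustrative only: in the paper these behaviours are learned through multi-agent reinforcement learning, not hard-coded, and the payoff values, the function names, and the `play_pair` helper are assumptions introduced here.

```python
# Illustrative sketch: payoff values and all names are assumptions, not from the paper.
C, D = "C", "D"

# A conventional Prisoner's Dilemma payoff table (assumed values, T > R > P > S).
PAYOFFS = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def tit_for_tat(partner_last_action):
    """Cooperate on the first round, then copy the partner's previous action."""
    return C if partner_last_action is None else partner_last_action


def always_defect(partner_last_action):
    """Defect unconditionally."""
    return D


def out_for_tat(partner_last_action):
    """Return True to stay with the partner, False to opt out after a defection."""
    return partner_last_action != D


def play_pair(strategy_a, strategy_b, max_rounds=5):
    """Play one pairing until a player opts out (Out-for-Tat) or max_rounds is hit."""
    last_a = last_b = None
    total_a = total_b = 0
    rounds = 0
    for _ in range(max_rounds):
        a, b = strategy_a(last_b), strategy_b(last_a)
        pay_a, pay_b = PAYOFFS[(a, b)]
        total_a += pay_a
        total_b += pay_b
        rounds += 1
        last_a, last_b = a, b
        # Both players apply Out-for-Tat: the pairing ends if either one defected.
        if not (out_for_tat(b) and out_for_tat(a)):
            break
    return rounds, total_a, total_b
```

Under these assumed payoffs, two Tit-for-Tat players stay paired for all rounds, while a defector is dropped after a single round and would have to be rematched at random.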