Dynamic Incentivized Cooperation under Changing Rewards

Published: 19 Dec 2025, Last Modified: 05 Jan 2026
Venue: AAMAS 2026 Extended Abstract
License: CC BY 4.0
Keywords: Multi-Agent Reinforcement Learning, Emergent Cooperation, Peer Incentivization, Dynamic Rewards, Social Dilemmas
TL;DR: We introduce DRIVE, a decentralized peer incentivization method that exchanges reward differences to sustain cooperation under changing rewards, backed by theoretical invariance guarantees and empirical superiority over existing approaches.
Abstract: *Peer incentivization* (PI) is a popular multi-agent reinforcement learning approach in which all agents can reward or penalize each other to achieve cooperation in social dilemmas. Despite their potential for scalable cooperation, current PI methods depend heavily on fixed incentive values that must be chosen appropriately with respect to the environmental rewards and are thus highly sensitive to changes in those rewards. Consequently, they fail to maintain cooperation under *changing rewards* in the environment, e.g., caused by modified specifications, varying supply and demand, or sensory flaws — even when the conditions for mutual cooperation remain the same. In this paper, we propose *Dynamic Reward Incentives for Variable Exchange* (DRIVE), an adaptive PI approach to cooperation in social dilemmas with changing rewards. DRIVE agents reciprocally exchange reward differences to incentivize mutual cooperation in a completely decentralized way. We show how DRIVE achieves mutual cooperation in the general Prisoner's Dilemma and empirically evaluate DRIVE in more complex sequential social dilemmas with changing rewards, demonstrating its ability to achieve and maintain cooperation, in contrast to current state-of-the-art PI methods.
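The abstract's core idea — deriving peer incentives from reward *differences* rather than fixed constants — can be illustrated in the matrix-game Prisoner's Dilemma. The sketch below is a hypothetical reading of that idea, not the authors' DRIVE algorithm: each agent transfers to a cooperating peer an incentive equal to the current temptation-minus-cooperation gap `T - R`, so the incentive adapts automatically when the payoffs change.

```python
# Hypothetical illustration of difference-based peer incentives in the
# Prisoner's Dilemma (T > R > P > S). Not the published DRIVE method.

def make_payoff(T=5.0, R=3.0, P=1.0, S=0.0):
    """Joint payoff table: payoff[(a0, a1)] -> (r0, r1).
    Action 0 = cooperate, action 1 = defect."""
    return {
        (0, 0): (R, R), (0, 1): (S, T),
        (1, 0): (T, S), (1, 1): (P, P),
    }

def incentivized_rewards(actions, payoff):
    """Each agent pays a cooperating peer an incentive g = T - R,
    read off the *current* payoff table rather than hard-coded.
    With this transfer, defection no longer strictly dominates."""
    r = list(payoff[actions])
    T = payoff[(1, 0)][0]          # temptation payoff
    R = payoff[(0, 0)][0]          # mutual-cooperation payoff
    g = T - R                      # incentive derived from reward difference
    for i, j in ((0, 1), (1, 0)):
        if actions[j] == 0:        # peer j cooperated
            r[j] += g              # agent i rewards the cooperator
            r[i] -= g              # and pays the cost itself
    return tuple(r)

p = make_payoff()
print(incentivized_rewards((0, 0), p))   # mutual cooperation: (3.0, 3.0)
print(incentivized_rewards((1, 0), p))   # defector earns only R: (3.0, 2.0)
```

Because `g` is recomputed from the payoff table, doubling the temptation (e.g. `make_payoff(T=10.0)`) automatically scales the transfer, whereas a fixed incentive tuned for `T=5` would no longer offset defection — which is the sensitivity to changing rewards the abstract describes.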
Area: Learning and Adaptation (LEARN)
Generative AI: I acknowledge that I have read and will follow this policy.
Submission Number: 784