D3C: Reducing the Price of Anarchy in Multi-Agent Learning

Ian Gemp; Kevin McKee; Richard Everett; Edgar Alfredo Duenez-Guzman; Yoram Bachrach; David Balduzzi; Andrea Tacchetti

28 Sept 2020 (modified: 05 May 2023) | ICLR 2021 Conference Blind Submission | Readers: Everyone

Keywords: multiagent, social dilemma, reinforcement learning

Abstract: Even in simple multi-agent systems, fixed incentives can lead to outcomes that are poor for the group and each individual agent. We propose a method, D3C, for online adjustment of agent incentives that reduces the loss incurred at a Nash equilibrium. Agents adjust their incentives by learning to mix their incentive with that of other agents, until a compromise is reached in a distributed fashion. We show that D3C improves outcomes for each agent and the group as a whole in several social dilemmas including a traffic network with Braess’s paradox, a prisoner’s dilemma, and several reinforcement learning domains.

One-sentence Summary: We propose a decentralized, gradient-based meta-algorithm to adapt the losses of agents in a multi-agent system such that the price of anarchy is reduced.
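
To make the idea of a price of anarchy and of loss mixing concrete, the following is a minimal sketch on a two-player prisoner's dilemma. It is not the D3C algorithm: the loss values, the function names, and the fixed mixing weight `w` are all assumptions chosen for illustration, whereas D3C learns how to mix online and in a distributed fashion. Here the price of anarchy is taken as the worst total loss over pure Nash equilibria divided by the best achievable total loss, always measured on the original (unmixed) losses.

```python
import itertools

# Hypothetical prisoner's dilemma losses (e.g. years in prison).
# Actions: 0 = cooperate, 1 = defect. LOSS_1[a1][a2] is player 1's loss;
# the game is symmetric, so player 2's loss is LOSS_1[a2][a1].
LOSS_1 = [[1.0, 3.0],
          [0.0, 2.0]]

ACTIONS = (0, 1)


def losses(a1, a2):
    """Original losses of both players for a joint action."""
    return LOSS_1[a1][a2], LOSS_1[a2][a1]


def mixed_losses(a1, a2, w):
    """Each player minimizes (1 - w) * own loss + w * other's loss.

    The fixed weight w is an illustrative assumption, not a learned quantity.
    """
    f1, f2 = losses(a1, a2)
    return (1 - w) * f1 + w * f2, (1 - w) * f2 + w * f1


def pure_nash(w):
    """Joint actions where no player lowers their mixed loss by deviating alone."""
    equilibria = []
    for a1, a2 in itertools.product(ACTIONS, repeat=2):
        m1, m2 = mixed_losses(a1, a2, w)
        p1_stays = m1 <= mixed_losses(1 - a1, a2, w)[0]
        p2_stays = m2 <= mixed_losses(a1, 1 - a2, w)[1]
        if p1_stays and p2_stays:
            equilibria.append((a1, a2))
    return equilibria


def price_of_anarchy(w):
    """Worst equilibrium total loss over optimal total loss (original losses).

    Assumes at least one pure Nash equilibrium exists, which holds for this game.
    """
    optimum = min(sum(losses(a1, a2))
                  for a1, a2 in itertools.product(ACTIONS, repeat=2))
    worst = max(sum(losses(a1, a2)) for a1, a2 in pure_nash(w))
    return worst / optimum


if __name__ == "__main__":
    for w in (0.0, 0.5):
        print(w, pure_nash(w), price_of_anarchy(w))
```

With `w = 0.0` (purely selfish losses) the only pure equilibrium is mutual defection, and the ratio is 2.0. With `w = 0.5` each player weighs the other's loss equally with their own, mutual cooperation becomes the only pure equilibrium, and the ratio drops to 1.0.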

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Reviewed Version (pdf): https://openreview.net/references/pdf?id=YRtyTv3BAg
