Oracles and Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning

Matthias Gerstgrasser; David C. Parkes

Published: 01 Feb 2023, Last Modified: 13 Feb 2023

Submitted to ICLR 2023

Readers: Everyone

Keywords: Multi-Agent Reinforcement Learning, Game Theory, Security Games, Mechanism Design, Stackelberg Equilibrium, Indirect Mechanism Design

TL;DR: We present a general framework for learning Stackelberg equilibria in multi-agent reinforcement learning.

Abstract: Stackelberg equilibria arise naturally in a range of popular learning problems, such as in security games or indirect mechanism design, and have received increasing attention in the reinforcement learning literature. We present a general framework for implementing Stackelberg equilibria search as a multi-agent RL problem, allowing a wide range of algorithmic design choices. We discuss how previous approaches can be seen as specific instantiations of this framework. As a key insight, we note that the design space allows for approaches not previously seen in the literature, for instance by leveraging multitask and meta-RL techniques for follower convergence. We propose one such approach using contextual policies and evaluate it experimentally on standard benchmark domains. Finally, we illustrate the effect of adopting designs outside the borders of our framework in controlled experiments.

Area: Reinforcement Learning (e.g., decision and control, planning, hierarchical RL, robotics)

Supplementary Material: zip
