Learning to Steer Markovian Agents under Model Uncertainty

Published: 01 Aug 2024 · Last Modified: 09 Oct 2024 · EWRL17 · CC BY 4.0
Keywords: Markov Games, Steering Learning Dynamics, Mechanism Design, Non-Episodic Reinforcement Learning
TL;DR: We study the steering problem under model uncertainty. We propose a formal problem formulation and a new optimization objective with theoretical justification, and we contribute both theoretical understanding of and empirical solutions to this objective.
Abstract: We study reward design for steering multi-agent systems towards desired equilibria \emph{without} prior knowledge of the agents' underlying policy learning dynamics model. We introduce a model-based non-episodic Reinforcement Learning (RL) formulation for our steering problem. To handle the model uncertainty, we contribute a new optimization objective aimed at learning a \emph{history-dependent} steering strategy, and establish guarantees on its optimal solution. Theoretically, we identify conditions for the existence of steering strategies that guide agents sufficiently close to the desired policies. Complementing our theoretical contributions, we provide empirically tractable algorithms to approximately solve our objective, effectively tackling the challenge of efficiently learning history-dependent strategies. We demonstrate the efficacy of our algorithms through empirical evaluations.
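The abstract's core idea (a designer adds incentive payments on top of agents' rewards, with an incentive rule that conditions on the interaction history) can be illustrated with a minimal toy sketch. This is not the paper's actual algorithm: the single softmax learner, the threshold rule, and all parameters below are illustrative assumptions.

```python
import numpy as np

# Toy sketch only, not the paper's method. A single agent follows simple
# Markovian learning dynamics (a REINFORCE-style softmax update), and a
# steering designer adds an incentive payment on top of the agent's base
# reward. The steering strategy is history-dependent: it keeps paying a
# bonus on the target action until the agent's recent policy is close
# enough to the desired one. All quantities here are assumptions.

rng = np.random.default_rng(0)
theta = np.array([0.0, 0.0])        # agent's policy logits over two actions
base_reward = np.array([1.0, 0.0])  # agent intrinsically prefers action 0
target_action = 1                   # designer's desired action
lr, bonus, threshold = 0.5, 2.0, 0.95

def policy(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

history = []
for t in range(300):
    pi = policy(theta)
    history.append(pi[target_action])
    # History-dependent steering: pay the bonus while the agent's recent
    # probability of the target action is still below the threshold.
    recent = np.mean(history[-10:])
    incentive = np.zeros(2)
    if recent < threshold:
        incentive[target_action] = bonus
    reward = base_reward + incentive
    # Agent's learning dynamics: softmax policy-gradient-style update.
    a = rng.choice(2, p=pi)
    grad = -pi
    grad[a] += 1.0
    theta = theta + lr * reward[a] * grad

final_pi = policy(theta)
print(final_pi[target_action])  # steered toward the target action
```

Without the incentive, this agent's dynamics drive it toward action 0; the history-dependent bonus reverses that drift, illustrating (in caricature) why the designer's strategy benefits from conditioning on past observations when the learning dynamics model is unknown.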
Supplementary Material: zip
Submission Number: 2