Learning to Steer Markovian Agents under Model Uncertainty

Published: 01 Aug 2024 · Last Modified: 09 Oct 2024 · EWRL17 · CC BY 4.0
Keywords: Markov Games, Steering Learning Dynamics, Mechanism Design, Non-Episodic Reinforcement Learning
TL;DR: We study the steering problem under model uncertainty. We propose a formal problem formulation and a new optimization objective with theoretical justification, and we contribute both theoretical understanding of and empirical solutions to this objective.
Abstract: We study reward design for steering multi-agent systems towards desired equilibria \emph{without} prior knowledge of the agents' underlying policy learning dynamics model. We introduce a model-based non-episodic Reinforcement Learning (RL) formulation for our steering problem. To handle the model uncertainty, we contribute a new optimization objective aimed at learning a \emph{history-dependent} steering strategy, and establish guarantees on its optimal solution. Theoretically, we identify conditions for the existence of steering strategies that guide agents sufficiently close to the desired policies. Complementing our theoretical contributions, we provide empirically tractable algorithms to approximately solve our objective, effectively tackling the challenge of efficiently learning history-dependent strategies. We demonstrate the efficacy of our algorithms through empirical evaluations.
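The abstract's core idea (a designer adds incentive payments on top of agents' rewards, with an incentive rule that conditions on the interaction history) can be illustrated with a minimal toy sketch. This is not the paper's actual algorithm: the single softmax learner, the threshold rule, and all parameters below are illustrative assumptions.

```python
import numpy as np

# Toy sketch only, not the paper's method. A single agent follows simple
# Markovian learning dynamics (a REINFORCE-style softmax update), and a
# steering designer adds an incentive payment on top of the agent's base
# reward. The steering strategy is history-dependent: it keeps paying a
# bonus on the target action until the agent's recent policy is close
# enough to the desired one. All quantities here are assumptions.

rng = np.random.default_rng(0)
theta = np.array([0.0, 0.0])        # agent's policy logits over two actions
base_reward = np.array([1.0, 0.0])  # agent intrinsically prefers action 0
target_action = 1                   # designer's desired action
lr, bonus, threshold = 0.5, 2.0, 0.95

def policy(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

history = []
for t in range(300):
    pi = policy(theta)
    history.append(pi[target_action])
    # History-dependent steering: pay the bonus while the agent's recent
    # probability of the target action is still below the threshold.
    recent = np.mean(history[-10:])
    incentive = np.zeros(2)
    if recent < threshold:
        incentive[target_action] = bonus
    reward = base_reward + incentive
    # Agent's learning dynamics: softmax policy-gradient-style update.
    a = rng.choice(2, p=pi)
    grad = -pi
    grad[a] += 1.0
    theta = theta + lr * reward[a] * grad

final_pi = policy(theta)
print(final_pi[target_action])  # steered toward the target action
```

Without the incentive, this agent's dynamics drive it toward action 0; the history-dependent bonus reverses that drift, illustrating (in caricature) why the designer's strategy benefits from conditioning on past observations when the learning dynamics model is unknown.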
Supplementary Material: zip
Submission Number: 2