Keywords: No-Regret Learning, Inverse Game Theory, Revealed Preference, Steering
TL;DR: We give the first algorithm that learns a game from the actions of arbitrary no-regret learning agents by paying them.
Abstract: We study the problem of learning the utility functions of no-regret learning agents in a repeated normal-form game.
Unlike most prior literature, we introduce a principal with the power to observe the agents playing the game, send them signals, and give them *payments* as a function of their actions.
We show that the principal can, using a number of rounds polynomial in the size of the game, learn the utility functions of all agents to any desired precision $\varepsilon > 0$, no matter which no-regret learning algorithms the agents run (a standard such learner is sketched below).
Our main technique is to formulate a zero-sum game between the principal and the agents, where the principal's strategy space is the set of all payment functions.
Finally, we discuss implications for the problem of *steering* agents to a desired equilibrium: in particular, using our utility-learning algorithm as a subroutine, we introduce the first algorithm for steering arbitrary no-regret learning agents without prior knowledge of their utilities.
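For concreteness, the following is a minimal sketch of a standard no-regret learner, multiplicative weights (Hedge), the kind of agent the guarantee above quantifies over. The function name and interface (`hedge_play`, `utility_matrix`, `eta`) are our own illustration, not code from the paper.

```python
import numpy as np

def hedge_play(utility_matrix, opponent_actions, eta=0.1, seed=0):
    """Multiplicative-weights (Hedge) dynamics for one agent in a repeated
    two-player normal-form game: a textbook no-regret learner.

    utility_matrix[a, b]: the agent's utility for action a vs. opponent action b.
    opponent_actions:     the opponent's action in each round.
    """
    rng = np.random.default_rng(seed)
    n = utility_matrix.shape[0]
    weights = np.ones(n)
    actions = []
    for b in opponent_actions:
        probs = weights / weights.sum()         # current mixed strategy
        actions.append(rng.choice(n, p=probs))  # sample this round's action
        # Full-feedback update: exponentially reweight every action
        # by its utility against the observed opponent action.
        weights *= np.exp(eta * utility_matrix[:, b])
    return actions
```

With $\eta \approx \sqrt{\log n / T}$ over $T$ rounds, this achieves $O(\sqrt{T \log n})$ regret, which is the no-regret property the principal's guarantees rely on.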
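The zero-sum formulation plays out over an observe/pay interaction loop roughly like the skeleton below. Everything here (`principal_transcript`, `agent_step`, the random placeholder payments) is a hypothetical interface sketch, not the paper's algorithm: the paper's principal chooses payments strategically, and the logged transcript would feed a revealed-preference estimator, which is elided.

```python
import numpy as np

def principal_transcript(agent_step, n_actions, n_rounds, scale=1.0, seed=0):
    """Skeleton of the principal's observe/pay loop. Each round the principal
    commits to a payment function over the agent's actions, the agent (running
    any no-regret algorithm) responds, and the pair is logged. Utility
    estimation from the transcript is not shown.
    """
    rng = np.random.default_rng(seed)
    transcript = []
    for _ in range(n_rounds):
        # Placeholder payment function: random payments only fix the
        # interface; the paper's principal picks them via a zero-sum
        # game whose strategy space is the set of payment functions.
        payments = scale * rng.random(n_actions)
        action = agent_step(payments)   # agent sees payments, then acts
        transcript.append((payments, action))
    return transcript
```

A Hedge agent like the one above fits the `agent_step` slot by adding the round's payments to its utilities before updating.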
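To see why learned utilities enable steering at all, here is one blunt scheme, assuming utilities have already been estimated: pay each agent, whenever it plays its target action, enough that the target action becomes strictly dominant, so any no-regret learner plays it in all but a vanishing fraction of rounds. This is our hypothetical illustration of the connection, not the paper's steering algorithm.

```python
import numpy as np

def dominance_payments(est_utils, targets, slack=0.1):
    """Naive steering scheme for a two-player normal-form game.

    est_utils[i][a, b]: agent i's (estimated) utility for its own action a
                        against the other agent's action b.
    targets[i]:         agent i's action in the desired profile.
    Returns the bonus paid to agent i on rounds where it plays targets[i].
    """
    payments = []
    for u, t in zip(est_utils, targets):
        # Largest gain any deviation a ever has over t, across all
        # opponent actions b: max_{a,b} u[a, b] - u[t, b].
        gap = np.max(u - u[t, :][None, :])
        payments.append(gap + slack)  # makes t strictly dominant
    return payments
```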
Primary Area: learning theory
Submission Number: 9913