Learning Deviation Payoffs in Simulation-Based Games

Samuel Sokota, Caleb Ho, Bryce Wiedenbeck

25 Sept 2019 (modified: 17 Feb 2021)
Abstract: We present a novel approach for identifying approximate role-symmetric Nash equilibria in large simulation-based games. Our method uses neural networks to learn a mapping from mixed-strategy profiles to deviation payoffs—the expected values of playing pure-strategy deviations from those profiles. This learning can generalize from data about a tiny fraction of a game’s outcomes, permitting tractable analysis of exponentially large normal-form games. We give a procedure for iteratively refining the learned model with new data produced by sampling in the neighborhood of each candidate Nash equilibrium. Relative to the existing state of the art, deviation payoff learning dramatically simplifies the task of computing equilibria and more effectively addresses player asymmetries. We demonstrate empirically that deviation payoff learning identifies better approximate equilibria than previous methods and can handle more difficult settings, including games with many more players, strategies, and roles.
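To make the learned mapping concrete, the following is a minimal sketch (not the authors' implementation) of deviation-payoff regression in PyTorch, assuming a single symmetric role; the network architecture, the `sample_deviation_payoffs` simulator stub, and all sizes are illustrative placeholders.

```python
import torch
import torch.nn as nn

NUM_STRATEGIES = 5  # assumption: one symmetric role with 5 pure strategies

class DeviationPayoffNet(nn.Module):
    """Maps a mixed-strategy profile to one deviation payoff per pure strategy."""
    def __init__(self, num_strategies: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_strategies, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_strategies),
        )

    def forward(self, mixture: torch.Tensor) -> torch.Tensor:
        return self.net(mixture)

# Toy stand-in for the game simulator: noisy linear payoffs.
# A real pipeline would estimate deviation payoffs from simulation runs.
_TOY_PAYOFF = torch.randn(NUM_STRATEGIES, NUM_STRATEGIES)

def sample_deviation_payoffs(mixtures: torch.Tensor) -> torch.Tensor:
    clean = mixtures @ _TOY_PAYOFF.T
    return clean + 0.05 * torch.randn_like(clean)

model = DeviationPayoffNet(NUM_STRATEGIES)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Fit the network to simulator samples; in the paper's refinement procedure,
# new mixtures would be drawn near each candidate equilibrium rather than uniformly.
for step in range(1000):
    mixtures = torch.distributions.Dirichlet(torch.ones(NUM_STRATEGIES)).sample((32,))
    targets = sample_deviation_payoffs(mixtures)
    loss = loss_fn(model(mixtures), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Once trained, an equilibrium-search routine (for example, replicator dynamics or a regret-minimization loop; the abstract does not specify one) could query the learned model in place of the expensive simulator when evaluating candidate mixtures.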
