Explore Reinforced: Equilibrium Approximation with Reinforcement Learning

Published: 01 Jan 2024 · Last Modified: 06 Nov 2025 · GameSec (1) 2024 · CC BY-SA 4.0
Abstract: Current approximate Coarse Correlated Equilibria (CCE) algorithms struggle to approximate equilibria for games in large stochastic environments. While these game-theoretic methods are theoretically guaranteed to converge to a strong solution concept, reinforcement learning (RL) algorithms have shown increasing capability in such environments but lack the equilibrium guarantees of game-theoretic approaches. In this paper, we introduce Exp3-IXRL, an equilibrium approximator that leverages an RL agent's action selection to update equilibrium approximations while preserving the integrity of both learning processes, thereby extending the Exp3 family of algorithms beyond stateless, non-stochastic settings. Empirically, we demonstrate improved performance in classic non-stochastic multi-armed bandit settings, capability in stochastic multi-armed bandits, and strong results in a complex, adversarial cybersecurity network environment.
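For context, the classic Exp3 algorithm that this work builds on maintains exponentially weighted arm probabilities with an importance-weighted reward estimate. The sketch below is of standard Exp3 for adversarial bandits, not the authors' Exp3-IXRL variant; the function name, signature, and the per-round weight normalization are illustrative choices.

```python
import math
import random

def exp3(K, T, reward_fn, gamma=0.1, seed=0):
    """Classic Exp3 for non-stochastic (adversarial) K-armed bandits.

    reward_fn(t, arm) must return a reward in [0, 1].
    Returns (total reward collected, final sampling distribution).
    """
    rng = random.Random(seed)
    weights = [1.0] * K
    total = 0.0
    for t in range(T):
        wsum = sum(weights)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * w / wsum + gamma / K for w in weights]
        arm = rng.choices(range(K), weights=probs)[0]
        x = reward_fn(t, arm)
        total += x
        # Importance-weighted estimate keeps the reward estimator unbiased.
        xhat = x / probs[arm]
        weights[arm] *= math.exp(gamma * xhat / K)
        # Normalize to avoid floating-point overflow over long horizons.
        wmax = max(weights)
        weights = [w / wmax for w in weights]
    wsum = sum(weights)
    return total, [(1 - gamma) * w / wsum + gamma / K for w in weights]
```

Because each arm's probability is floored at `gamma / K`, every arm keeps being explored, which is what gives Exp3 its adversarial regret guarantee; the paper's contribution is to drive these updates from an RL agent's action selection rather than from Exp3's own sampling.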