Reframing Multi-Agent Reinforcement Learning with Variational Inequalities

Published: 01 Jul 2025, Last Modified: 23 Jul 2025 · Finding the Frame (RLC 2025) · CC BY 4.0
Keywords: multi-agent reinforcement learning, variational inequalities
Abstract: Multi-Agent Reinforcement Learning (MARL) has become a versatile tool for tackling complex tasks, as agents learn to cooperate and compete across a wide range of applications. Yet instability remains a persistent hurdle. We pinpoint one key source of this instability: the *rotational* dynamics that naturally arise when agents optimize conflicting objectives, and that standard gradient methods struggle to tame. We reframe MARL approaches using Variational Inequalities (VIs), which offer a unified framework for addressing such issues. Leveraging optimization techniques designed for VIs, we propose a general approach for integrating gradient-based VI methods that can handle rotational dynamics into existing MARL algorithms. Empirical results demonstrate significant performance improvements across benchmarks. In the zero-sum games *Rock--paper--scissors* and *Matching pennies*, VI methods converge more reliably to equilibrium strategies, and in the *Multi-Agent Particle Environment: Predator-prey* task they also improve team coordination. These results underscore the potential of advanced optimization techniques in MARL.
Submission Number: 34
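
To make the rotational-dynamics point concrete: solving a VI means finding $z^\star$ such that $\langle F(z^\star), z - z^\star \rangle \ge 0$ for all feasible $z$, which for unconstrained smooth games reduces to $F(z^\star) = 0$, where $F$ stacks each agent's gradient of its own loss. In the bilinear game $\min_x \max_y xy$ (a standard unconstrained surrogate for *Matching pennies*), that operator is $F(x, y) = (y, -x)$, a purely rotational field, so naive gradient descent-ascent spirals away from the equilibrium at the origin. The sketch below is illustrative and not taken from the paper; it contrasts descent-ascent with extragradient (Korpelevich, 1976), one classic gradient-based VI method of the kind the abstract alludes to.

```python
import numpy as np

def F(z):
    # Operator of the bilinear game min_x max_y x*y:
    # F(x, y) = (d/dx of the objective, -d/dy of the objective) = (y, -x).
    # This vector field is purely rotational around the equilibrium (0, 0).
    x, y = z
    return np.array([y, -x])

def gda(z, lr=0.1, steps=200):
    # Simultaneous gradient descent-ascent: spirals outward on this game.
    for _ in range(steps):
        z = z - lr * F(z)
    return z

def extragradient(z, lr=0.1, steps=200):
    # Extragradient: take a look-ahead (extrapolation) step, then update
    # using the operator evaluated at the look-ahead point, which
    # anticipates the rotation.
    for _ in range(steps):
        z_look = z - lr * F(z)   # prediction step
        z = z - lr * F(z_look)   # update with the anticipated gradient
    return z

z0 = np.array([1.0, 1.0])
print("GDA distance to equilibrium:          ", np.linalg.norm(gda(z0)))
print("Extragradient distance to equilibrium:", np.linalg.norm(extragradient(z0)))
# GDA's distance grows (to about 3.8 here) while extragradient's shrinks
# (to about 0.5), illustrating how VI methods tame rotational dynamics.
```

This bilinear game is the simplest setting where the rotational component of the game operator dominates; the paper's contribution, as the abstract describes, is plugging methods of this kind into full MARL algorithms.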