Finding General Equilibria in Many-Agent Economic Simulations using Deep Reinforcement Learning

Michael Curry; Alexander R Trott; Soham Phade; Yu Bai; Stephan Zheng

Published: 28 Jan 2022, Last Modified: 22 Jun 2025, ICLR 2022 Submitted, Readers: Everyone

Keywords: reinforcement learning, economics, simulation, multi-agent RL, equilibrium

Abstract: Real economies can be seen as sequential imperfect-information games with many heterogeneous, interacting strategic agents of various types, such as consumers, firms, and governments. Dynamic general equilibrium (DGE) models are common economic tools for modeling the economic activity, interactions, and outcomes in such systems. However, existing analytical and computational methods struggle to find explicit equilibria when all agents are strategic and interact, while joint learning is unstable and challenging. A key reason, among others, is that the actions of one economic agent may change the reward function of another agent, e.g., a consumer's expendable income changes when firms change prices or governments change taxes. We show that multi-agent deep reinforcement learning (RL) can discover stable solutions that are $\epsilon$-Nash equilibria for a meta-game over agent types, in economic simulations with many agents, through the use of structured learning curricula and efficient GPU-only simulation and training. Conceptually, our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing, that are commonly used for analytical tractability. Our GPU implementation enables training and analyzing economies with a large number of agents within reasonable time frames, e.g., training completes within a day. We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes. We validate the learned meta-game $\epsilon$-Nash equilibria through approximate best-response analyses, show that RL policies align with economic intuitions, and show that our approach is constructive, e.g., by explicitly learning a spectrum of meta-game $\epsilon$-Nash equilibria in open economic models.
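
For reference, the standard notion of an $\epsilon$-Nash equilibrium can be written as follows. This is a sketch in generic notation, not the paper's own: the symbols $\pi_i$ (policy of agent type $i$), $\pi_{-i}$ (policies of all other types), and $u_i$ (expected return of type $i$) are assumed here for illustration. A policy profile $\pi$ is an $\epsilon$-Nash equilibrium if no type can gain more than $\epsilon$ by deviating unilaterally:

$$
u_i(\pi_i', \pi_{-i}) \le u_i(\pi_i, \pi_{-i}) + \epsilon \quad \text{for all } i \text{ and all alternative policies } \pi_i'.
$$

Under this reading, an approximate best-response analysis would presumably estimate the left-hand side by training a single type against the fixed policies of the others and comparing the return it achieves with the return under $\pi$.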

One-sentence Summary: We present empirical techniques for training an economic simulation with a hierarchy of agent types and large numbers of agents.

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/finding-general-equilibria-in-many-agent/code)
