From Debate to Equilibrium: Belief‑Driven Multi‑Agent LLM Reasoning via Bayesian Nash Equilibrium

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · License: CC BY 4.0
Abstract: Multi-agent frameworks can substantially boost the reasoning power of large language models (LLMs), but they typically incur heavy computational costs and lack convergence guarantees. To overcome these challenges, we recast multi-LLM coordination as an incomplete-information game and seek a Bayesian Nash equilibrium (BNE), in which each agent optimally responds to its probabilistic beliefs about the strategies of others. We introduce Efficient Coordination via Nash Equilibrium (ECON), a hierarchical reinforcement-learning paradigm that marries distributed reasoning with centralized final output. Under ECON, each LLM independently selects responses that maximize its expected reward, conditioned on its beliefs about co-agents, without requiring costly inter-agent exchanges. We mathematically prove that ECON attains a markedly tighter regret bound than non-equilibrium multi-agent schemes. Empirically, ECON outperforms existing multi-LLM approaches by 11.2% on average across six benchmarks spanning complex reasoning and planning tasks. Further experiments demonstrate ECON’s ability to flexibly incorporate additional models, confirming its scalability and paving the way toward larger, more powerful multi-LLM ensembles. The code is publicly available at: https://github.com/tmlr-group/ECON.
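For readers less familiar with the solution concept, the Bayesian Nash equilibrium condition the abstract refers to can be stated as follows. This is the standard textbook definition with illustrative notation of our own, not the paper's exact formulation:

```latex
% Bayesian Nash equilibrium (standard definition; notation is illustrative).
% Agent i, with type t_i and belief p(t_{-i} | t_i) over co-agents' types,
% plays a strategy sigma_i^* that best-responds in expectation:
\[
  \sigma_i^*(t_i) \in \arg\max_{a_i \in A_i}
  \; \mathbb{E}_{t_{-i} \sim p(\cdot \mid t_i)}
  \Bigl[ u_i\bigl(a_i, \sigma_{-i}^*(t_{-i});\, t_i, t_{-i}\bigr) \Bigr]
  \quad \text{for every agent } i \text{ and every type } t_i .
\]
```

In ECON's setting, each LLM agent plays the role of one such agent: it maximizes expected reward against its beliefs about the others rather than against their observed messages.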
Lay Summary:
(1) Problem: When multiple large language models (LLMs) work together as agents, they can solve complex problems better than any model working alone. However, current approaches require these agents to constantly exchange messages with each other, which is expensive and slow. This heavy communication also limits how many LLM agents can work together effectively.
(2) Solution: We developed ECON, a new coordination method that eliminates the need for direct communication between LLM agents. Instead of sending messages back and forth, each agent maintains beliefs about what the others are likely to do and makes decisions based on these beliefs. We used game-theoretic principles to ensure all agents reach an optimal collaborative strategy without talking to each other.
(3) Impact: ECON outperformed existing multi-agent approaches by 11.2% on average across six challenging reasoning tasks while using 21.4% fewer computational resources. Our method can scale to coordinate up to nine LLM agents effectively, opening the door to larger, more powerful agent teams that can tackle increasingly complex problems.
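As a concrete illustration of belief-driven selection without inter-agent messaging, here is a minimal sketch. It is our own hypothetical rendering, not the ECON API: the names `candidate_responses`, `coagent_candidates`, `belief`, and `reward` are assumptions, and the paper learns its beliefs via hierarchical RL rather than taking them as given.

```python
import numpy as np

# Minimal sketch of belief-driven response selection (illustrative only).
# Each agent scores its own candidate answers against a belief distribution
# over co-agents' likely answers -- no messages are exchanged between agents.

def select_response(candidate_responses, coagent_candidates, belief, reward):
    """Pick the candidate maximizing expected reward under `belief`.

    belief[j][k] is this agent's probability that co-agent j produces
    its k-th candidate response; reward(mine, other) scores a pairing.
    """
    best, best_value = None, -np.inf
    for mine in candidate_responses:
        # Expected reward, averaging over believed co-agent responses.
        value = 0.0
        for j, theirs in enumerate(coagent_candidates):
            for k, other in enumerate(theirs):
                value += belief[j][k] * reward(mine, other)
        if value > best_value:
            best, best_value = mine, value
    return best
```

This shows only the per-step best response; in the full framework, the learned beliefs are what drive the agents toward the equilibrium described in the abstract.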
Link To Code: https://github.com/tmlr-group/ECON
Primary Area: Deep Learning->Large Language Models
Keywords: Large Language Models, Reasoning, Multiagent Reasoning
Submission Number: 4882