ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

ACL ARR 2025 May Submission6281 Authors

20 May 2025 (modified: 29 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Multi-agent systems (MAS) have emerged as a promising approach for enhancing the reasoning capabilities of large language models in complex problem-solving. However, current MAS frameworks are limited by poor flexibility and scalability, and their optimization strategies remain underdeveloped. To address these challenges, we propose ReSo, which integrates task graph generation with a reward-driven two-stage agent selection process. The core of ReSo is the proposed Collaborative Reward Model, which provides fine-grained reward signals for optimizing MAS cooperation. We also introduce an automated data synthesis framework for generating MAS benchmarks without human annotations. Experimentally, ReSo matches or outperforms existing methods. ReSo achieves $\textbf{33.7\%}$ and $\textbf{32.3\%}$ accuracy on Math-MAS and SciBench-MAS, respectively, while other methods completely fail.

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: Large Language Model, Multi-agent System, LLM-based agent

Contribution Types: NLP engineering experiment, Data resources

Languages Studied: English

Submission Number: 6281
