ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks

ACL ARR 2025 February Submission8002 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0

Abstract: Multi-agent systems (MAS) have emerged as a promising approach for enhancing the reasoning capabilities of large language models in complex problem-solving. However, current MAS frameworks are limited by poor flexibility and scalability, and their optimization strategies remain underdeveloped. To address these challenges, we propose ReSo, which integrates task graph generation with a reward-driven two-stage agent selection process. The core of ReSo is the proposed Collaborative Reward Model, which provides fine-grained reward signals for MAS cooperation, enabling dynamic performance optimization. We also introduce an automated data synthesis framework for generating complex MAS benchmarks, eliminating the need for human annotations. Experimentally, ReSo matches or outperforms existing methods on Math and SciBench, and stands out on SciBench-MAS and Math-MAS, the proposed complex collaboration benchmarks.

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: Large Language Model, Multi-agent System, LLM-based agent

Languages Studied: English

Submission Number: 8002
