Abstract: Multi-agent systems (MAS) have emerged as a promising approach for enhancing the reasoning capabilities of large language models in complex problem-solving. However, current MAS frameworks are limited by poor flexibility and scalability, and their optimization strategies remain underdeveloped. To address these challenges, we propose ReSo, which integrates task graph generation with a reward-driven two-stage agent selection process. At the core of ReSo is the proposed Collaborative Reward Model, which provides fine-grained reward signals for MAS cooperation, enabling dynamic performance optimization. We also introduce an automated data synthesis framework for generating complex MAS benchmarks, eliminating the need for human annotations. Experimentally, ReSo matches or outperforms existing methods on Math and SciBench, and stands out on SciBench-MAS and Math-MAS, the proposed complex collaboration benchmarks.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Model, Multi-agent System, LLM-based agent
Languages Studied: English
Submission Number: 8002