Abstract: The advancement of large language models (LLMs) has spurred the development of multi-agent systems for complex tasks, yet existing approaches often train agents independently, leading to capability gaps and coordination failures. To address this, we propose MOAT, a Multi-Agent Joint Alignment Tuning framework that bridges the capability gap between planning and grounding agents through iterative joint alignment. MOAT alternates between two key phases: (1) Planning Agent Alignment, which optimizes subgoal generation by rewarding sequences that reduce grounding perplexity, and (2) Grounding Agent Improving, which enhances action generation using high-quality subgoal-action pairs filtered by a critic model. Theoretical analysis proves that MOAT ensures non-decreasing performance and convergence. Experiments across six benchmarks demonstrate that MOAT outperforms state-of-the-art baselines, achieving average improvements of 3.1\% on held-in tasks and 4.4\% on held-out tasks with 7B-scale models.
Notably, MOAT surpasses GPT-4 on Mind2Web by over 50\%, showcasing its ability to harmonize smaller open-source LLMs into a competitive multi-agent system.
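To make the alternation concrete, the sketch below mirrors the two phases described in the abstract. It is a minimal illustration under stated assumptions, not the authors' implementation: every helper callable (sample_subgoals, grounding_perplexity, preference_optimize, generate_subgoals, generate_actions, critic_score, supervised_finetune) and the critic_threshold parameter are hypothetical placeholders supplied by the caller; only the control flow follows the description above.

```python
# Illustrative sketch of MOAT's iterative joint alignment loop (not the authors' code).
# `helpers` bundles hypothetical training/scoring utilities; only the two-phase
# alternation reflects the abstract's description.

def moat_joint_alignment(planner, grounder, critic, tasks, helpers,
                         rounds=3, critic_threshold=0.5):
    h = helpers  # hypothetical bundle of sampling, scoring, and tuning utilities
    for _ in range(rounds):
        # Phase 1: Planning Agent Alignment.
        # Reward subgoal sequences that lower the grounding agent's perplexity
        # (lower perplexity -> higher reward), then update the planner on them.
        planning_data = []
        for task in tasks:
            candidates = h.sample_subgoals(planner, task)
            rewards = [-h.grounding_perplexity(grounder, task, sg) for sg in candidates]
            planning_data.append((task, candidates, rewards))
        planner = h.preference_optimize(planner, planning_data)

        # Phase 2: Grounding Agent Improving.
        # Keep only subgoal-action pairs the critic model rates highly,
        # then fine-tune the grounding agent on the filtered pairs.
        kept_pairs = []
        for task in tasks:
            subgoals = h.generate_subgoals(planner, task)
            actions = h.generate_actions(grounder, task, subgoals)
            if h.critic_score(critic, task, subgoals, actions) >= critic_threshold:
                kept_pairs.append((task, subgoals, actions))
        grounder = h.supervised_finetune(grounder, kept_pairs)

    return planner, grounder
```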
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: applications; fine-tuning
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 6841