CCAGENT: Learning Constructive Consensus for Multi-Agent LLMs in Real-World Environments

ACL ARR 2025 May Submission6518 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · License: CC BY 4.0

Abstract: Real-world decision-making often involves complex deliberation among diverse stakeholders with conflicting values. However, existing LLM-based multi-agent frameworks struggle with two key challenges: (1) they lack real-world grounding, relying on synthetic tasks that fail to capture the complexity of real-world decision-making, and (2) they are difficult to supervise effectively, since desirable behaviors like principled compromise, quality discussion, and open-mindedness are abstract and hard to quantify. We address both challenges with CCAgent, a framework for training deliberative agents using contrastive supervision over natural language rationales and counterfactuals. First, we introduce two decision-making datasets grounded in real-world sources: city planning stakeholder interviews, and U.S. Senator interviews and voting patterns. Second, we propose nine training objectives that reinforce socially aligned behaviors (such as consensus, compromise, and low dogmatism) without requiring scalar rewards or human preference labels. We also propose eight strategies for efficient multi-agent debate. Lastly, we introduce CCAgent, a lightweight, few-shot, automatic Direct Preference Optimization (DPO) method for efficient multi-agent debate. CCAgent outperforms baselines, achieving faster consensus with high-quality discussions between agents. Our results demonstrate that DPO enables principled deliberation even in complex, disagreement-rich domains.

Paper Type: Long

Research Area: Language Modeling

Research Area Keywords: LLM/AI agents, applications

Contribution Types: Model analysis & interpretability, Data resources, Data analysis

Languages Studied: English

Submission Number: 6518
