Language as a Treatment: Causal Estimation of Homogeneity in Multi-Agent Systems

ACL ARR 2026 January Submission2449 Authors

03 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | Readers: Everyone | License: CC BY 4.0
Keywords: Homogeneity, Language, Multi-agent debate, randomized controlled trial
Abstract: Multi-agent debate is increasingly used to improve large language model (LLM) reasoning and to support alignment-oriented judgments, yet these systems risk collapsing into homogeneous arguments (“groupthink”). Existing evaluations often conflate prompt language with topic selection and other pipeline variations, making it difficult to attribute homogeneity to specific design factors. We address this problem with a pre-registered, design-based randomized controlled trial that isolates the causal effects of (i) topic-selection policy and (ii) language conditioning on debate homogeneity. Our two-stage randomization first samples a policy domain and then a motion from a bilingual WUDC 2023–2025 motion pool; for each motion, we run paired Chinese and English debate sessions with yoked model-to-role assignments, randomized language order, and strict context resets. We operationalize homogeneity using a multilingual Homogeneity Index that combines lexical similarity (generalized Jensen–Shannon divergence under a shared tokenizer) and semantic similarity (embedding-based cosine aggregation), with anchored standardization to enable cross-language comparability. Across 99 paired motion draws, switching to Chinese causes a large increase in homogeneity (ATE = 0.499, 95% CI [0.442, 0.556], Fisher p < 0.001), substantially larger than domain-level differences, which are statistically subtle after multiple-comparison control. These findings identify language conditioning as a dominant driver of convergence in multi-agent debates and motivate multilingual-aware evaluation and mitigation for debate-based systems.
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: causality, LLM/AI agents, cross-lingual
Languages Studied: English, Chinese
Submission Number: 2449
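
Illustrative sketch (not the authors' code): the abstract names two ingredients that can be made concrete in a few lines, a generalized Jensen–Shannon divergence over token distributions and a Fisher-style randomization test on paired Chinese-minus-English differences. The Python below is a minimal sketch under stated assumptions: uniform weights and base-2 logarithms for the divergence, a sign-flip scheme for the randomization test, and hypothetical function names (`generalized_jsd`, `paired_sign_flip_test`). The paper's actual Homogeneity Index, its anchored standardization, and its exact test procedure are not specified in the abstract and may differ.

```python
import math
import random
from collections import Counter


def generalized_jsd(token_lists):
    """Generalized Jensen-Shannon divergence (uniform weights, log base 2)
    across the token distributions of several non-empty token lists.
    Lower values mean the texts use more similar vocabulary."""
    counts = [Counter(tokens) for tokens in token_lists]
    vocab = set().union(*counts)
    dists = []
    for c in counts:
        total = sum(c.values())
        dists.append({w: c[w] / total for w in vocab})
    n = len(dists)
    mixture = {w: sum(d[w] for d in dists) / n for w in vocab}

    def entropy(d):
        return -sum(p * math.log2(p) for p in d.values() if p > 0)

    # JSD = H(mixture) - mean of the individual entropies.
    return entropy(mixture) - sum(entropy(d) for d in dists) / n


def paired_sign_flip_test(diffs, n_draws=10000, seed=0):
    """Monte Carlo randomization test for paired differences.
    Assumption: under the sharp null of no effect, the sign of each
    paired difference is exchangeable, so signs are flipped at random.
    Returns (mean difference, two-sided p-value)."""
    rng = random.Random(seed)
    n = len(diffs)
    ate = sum(diffs) / n
    extreme = 0
    for _ in range(n_draws):
        stat = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(stat) >= abs(ate) - 1e-12:
            extreme += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return ate, (extreme + 1) / (n_draws + 1)
```

For example, `generalized_jsd([["a"], ["b"]])` is 1 bit for two texts with disjoint single-token vocabularies, and identical texts give 0. Calling `paired_sign_flip_test` on a list of per-motion differences (homogeneity in Chinese minus homogeneity in English) returns the estimated average effect together with a randomization p-value; any input values used to try it out are hypothetical and are not the study's data.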