Beware of the Woozle Effect: Exploring and Mitigating Hallucination Propagation in Multi-Agent Debate

ACL ARR 2025 February Submission7864 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: Large Language Model-based agents have demonstrated impressive capabilities in various tasks. To further enhance their abilities, the collaboration of multiple agents presents a promising avenue. Recently, Multi-Agent Debate (MAD) was introduced as a typical collaborative method, in which agents discuss potential solutions to a problem over several rounds of debate. However, researchers have observed that MAD is not consistently superior to single-agent methods, and this issue has not been sufficiently explored. In this paper, we experimentally identify the cause of MAD's instability, namely the woozle effect, which refers to the propagation of hallucinations among agents in the debate. Since MAD is always based on a static and fully connected communication topology, each agent can be misled by other agents whose responses contain erroneous information, and can subsequently spread this misinformation. To address this, we propose DIGRA, a novel MAD framework with a dynamic communication topology driven by the information gain ratio. Our evaluations across various benchmarks show that selecting appropriate counterparts for debates significantly mitigates hallucination propagation and promotes critical thinking and collaboration, ultimately leading to superior collective intelligence.
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: applications, robustness
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 7864