Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in text generation tasks. However, they are prone to hallucination, producing responses that lack supporting evidence in the source text. Various mitigation approaches have been proposed, but none has fully eliminated hallucinations, so accurately detecting them and alerting users remains crucial in practice. In this paper, we propose MAD-HD, a hierarchical framework for detecting ungrounded hallucinations based on a Multi-Agent Debate-Driven approach. Multiple agents engage in several rounds of debate; in each round, every agent updates its judgment and rationale in light of the other agents' evolving outputs, and the agents ultimately reach a consensus through voting. We evaluate our method on four public datasets, where it improves the average F1 score by approximately 2.6% over state-of-the-art methods.
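The debate-then-vote loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Opinion` record, the `Agent` call signature, the fixed round count, the tie-breaking rule, and the two toy rule-based agents (standing in for LLM-backed agents) are all assumptions made for illustration, and the hierarchical part of the framework is not modeled because the abstract does not specify it.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Opinion:
    hallucinated: bool  # the agent's current judgment
    rationale: str      # short justification shared with the other agents


# Hypothetical agent interface: (source, response, peers' opinions) -> Opinion.
Agent = Callable[[str, str, Sequence[Opinion]], Opinion]


def debate(agents: List[Agent], source: str, response: str, rounds: int = 3) -> bool:
    """Run a fixed number of debate rounds, then return the majority verdict."""
    # First round: every agent judges independently, with no peer opinions.
    opinions = [agent(source, response, ()) for agent in agents]
    for _ in range(rounds - 1):
        previous = opinions
        # Each agent revises its opinion after seeing the others' previous opinions.
        opinions = [
            agent(source, response, previous[:i] + previous[i + 1:])
            for i, agent in enumerate(agents)
        ]
    votes = Counter(op.hallucinated for op in opinions)
    # Assumption: ties are resolved toward flagging a hallucination.
    return votes[True] >= votes[False]


def keyword_agent(source: str, response: str, peers: Sequence[Opinion]) -> Opinion:
    # Toy rule: flag the response if any of its words is absent from the source.
    known = set(source.lower().split())
    missing = [w for w in response.lower().split() if w not in known]
    return Opinion(bool(missing), f"unsupported words: {missing}")


def follower_agent(source: str, response: str, peers: Sequence[Opinion]) -> Opinion:
    # Toy rule: start by trusting the response, then adopt the peers' majority view.
    if not peers:
        return Opinion(False, "no objection")
    flagged = 2 * sum(p.hallucinated for p in peers) > len(peers)
    return Opinion(flagged, "following peer majority")


if __name__ == "__main__":
    panel = [keyword_agent, keyword_agent, follower_agent]
    print(debate(panel, "the cat sat on the mat", "the cat sat on the moon"))
```

In this toy panel, the follower agent initially disagrees with the two keyword agents and switches sides in the second round, which mirrors the judgment-updating behavior the abstract describes. In a real system each agent would wrap a model call, and the peers' rationales would be included in its prompt.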
External IDs: dblp:conf/nlpcc/GaoBL25