Abstract: Large Language Models (LLMs) have shown exceptional reasoning capabilities, yet selecting the most reliable response from multiple LLMs remains a challenge, especially in resource-constrained settings. Existing approaches often rely on expensive external verifiers, human evaluators, or self-consistency techniques that require multiple samples from a single model. Multi-LLM debate provides a more interactive mechanism, yet it frequently underperforms self-consistency with the best LLM. In this work, we introduce a log-likelihood-based selection framework to enhance reasoning in multi-LLM debate settings. Our approach leverages uncertainty estimation to identify the most confident response while minimizing inference costs. We demonstrate that our method outperforms majority-vote selection and surpasses self-consistency when a large number of model calls is used. Through extensive experiments, we show that multi-LLM collaboration, when guided by uncertainty-aware selection, leads to an improvement of 6.19\% in settings with fewer model calls.
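As a rough illustration of the uncertainty-aware selection idea described above, the following minimal sketch scores each debate participant's final answer by its length-normalized log-likelihood and returns the most confident one. It assumes each model exposes per-token log-probabilities for its own output; the function names and data layout are illustrative, not the paper's implementation.

```python
# Minimal sketch (not the authors' released code) of log-likelihood-based
# response selection among multiple LLM debate candidates.
import math
from typing import List, Tuple


def mean_log_likelihood(token_logprobs: List[float]) -> float:
    """Length-normalized confidence: average log-probability per generated token."""
    return sum(token_logprobs) / max(len(token_logprobs), 1)


def select_most_confident(candidates: List[Tuple[str, List[float]]]) -> str:
    """Return the answer whose generating model assigned it the highest
    average token log-likelihood (i.e., the most confident response)."""
    best_answer, best_score = "", -math.inf
    for answer, token_logprobs in candidates:
        score = mean_log_likelihood(token_logprobs)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer


# Example: three debate participants' final answers, each paired with the
# token log-probabilities reported by the model that produced it.
candidates = [
    ("The answer is 42.", [-0.21, -0.05, -0.33, -0.10]),
    ("The answer is 41.", [-0.80, -0.44, -1.20, -0.65]),
    ("The answer is 42.", [-0.15, -0.09, -0.27, -0.12]),
]
print(select_most_confident(candidates))  # -> "The answer is 42."
```

Unlike majority voting, this selection needs no extra model calls beyond the debate itself, since the log-probabilities are a byproduct of generation.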
Paper Type: Short
Research Area: Language Modeling
Research Area Keywords: Multi-LLM Debate, Uncertainty, Log-Likelihood Estimation
Contribution Types: Model analysis & interpretability, Approaches low compute settings-efficiency
Languages Studied: N/A
Submission Number: 6152