CG-MAV: Confidence-Guided Multi-Agent Verification for LLM Reasoning

ACL ARR 2026 January Submission 10802 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Large language model, Agent, Reasoning
Abstract: Large Language Models (LLMs) have shown great potential on reasoning tasks through test-time scaling methods such as generating multiple candidate solutions for a given task. However, reliably selecting the correct answer from these candidates remains challenging. Existing self-certainty-based selection methods are effective on easy tasks but become unreliable on hard ones. We propose Confidence-Guided Multi-Agent Verification (CG-MAV), which uses confidence to distinguish easy from hard tasks and applies a different strategy to each. CG-MAV leverages self-certainty-based Borda voting not only as a selection signal but also as an indicator of task difficulty, enabling tasks to be classified as easy or hard. Easy tasks are handled through direct selection, while hard tasks are processed through multi-agent verification, in which each verification agent is assigned a clear, specific persona focused on a distinct aspect of solution correctness. Extensive experiments on two reasoning datasets across multiple models demonstrate the effectiveness and generalization of the proposed CG-MAV.
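The routing idea in the abstract — rank candidates by self-certainty, aggregate with Borda voting, and use the vote margin as a difficulty signal — can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the function names, the normalized top-two margin, and the `margin_threshold` value are all assumptions.

```python
from collections import defaultdict

def borda_select(candidates, certainties, margin_threshold=0.2):
    """Rank candidate answers by self-certainty, award Borda points to
    each distinct answer, and treat the normalized top-two point margin
    as a task-difficulty signal (illustrative sketch only).

    Returns (top_answer, is_easy); a "hard" task (is_easy=False) would
    then be routed to persona-based verification agents.
    """
    n = len(candidates)
    # Indices sorted from most to least self-certain.
    order = sorted(range(n), key=lambda i: certainties[i], reverse=True)
    points = defaultdict(int)
    for rank, idx in enumerate(order):
        # Standard Borda count: n-1 points for rank 0, down to 0 points.
        points[candidates[idx]] += n - 1 - rank
    ranked = sorted(points.items(), key=lambda kv: kv[1], reverse=True)
    top_answer, top_pts = ranked[0]
    runner_up_pts = ranked[1][1] if len(ranked) > 1 else 0
    max_pts = sum(range(n))  # points if one answer swept every rank
    margin = (top_pts - runner_up_pts) / max_pts if max_pts else 1.0
    return top_answer, margin >= margin_threshold

# Easy case: one answer dominates the certainty ranking -> direct selection.
answer, easy = borda_select(["42", "42", "17", "42"], [0.9, 0.8, 0.3, 0.7])
# Hard case: the vote is split -> would be escalated to verification agents.
_, easy2 = borda_select(["a", "b", "b", "a"], [0.9, 0.8, 0.7, 0.6])
```

In the easy case the dominant answer takes every top rank, so the margin is maximal and the task is resolved by direct selection; in the split case the margin collapses toward zero and the candidates would instead be passed to the verification agents.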
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Multi-agent systems, LLM agents
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis, Theory
Languages Studied: English
Submission Number: 10802