Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Published: 08 Jul 2025, Last Modified: 26 Aug 2025 · COLM 2025 · CC BY 4.0
Keywords: large language models, test-time compute, verification, scaling
TL;DR: We explore scaling the number of verifier models as a novel test-time scaling dimension for improving language model performance and introduce an algorithm that enables simple scaling along this dimension.
Abstract: By utilizing more computational resources at test time, large language models (LLMs) can improve without additional training. One common strategy uses *verifiers* to evaluate candidate outputs. In this work, we propose a novel scaling dimension for test-time compute: *scaling the number of verifier models*. We introduce Multi-Agent Verification (MAV) as a test-time compute paradigm that combines multiple verifiers to improve performance. To investigate scaling up verification compute, we propose combining multiple Aspect Verifiers (AVs) --- off-the-shelf LLMs prompted to verify different aspects of outputs. AVs are a convenient building block for MAV since they can be combined without any additional training. We introduce BoN-MAV, a simple multi-agent verification algorithm that combines best-of-*n* sampling with aspect verifiers, and we show that performance improves as we spend more verification compute at test time by increasing the number and type of verifiers. Moreover, we demonstrate both weak-to-strong generalization, where combining weak verifiers improves even stronger LLMs, and self-improvement, where the same base model is used to both generate and verify outputs. Our results establish scaling the number and type of verifier models as a promising new dimension for improving language model performance at test time.
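The BoN-MAV procedure described in the abstract can be read as: sample *n* candidate outputs from a generator LLM, query several aspect verifiers for a binary approval on each candidate, and return the candidate with the most approvals. The sketch below illustrates this reading under that assumption; the function names (`generate`, `ask_verifier`) and the example aspect prompts are hypothetical placeholders, not the paper's released implementation.

```python
# Minimal sketch of BoN-MAV: best-of-n sampling scored by multiple aspect verifiers.
# `generate`, `ask_verifier`, and ASPECT_PROMPTS are illustrative placeholders.

from collections import Counter

ASPECT_PROMPTS = [
    "Is the reasoning in this solution logically sound? Answer YES or NO.",
    "Does this solution directly answer the question asked? Answer YES or NO.",
    "Are the intermediate steps free of arithmetic errors? Answer YES or NO.",
]

def bon_mav(question, generate, ask_verifier, n=8):
    """Return the candidate output approved by the most aspect verifiers.

    generate(question) -> str               : samples one candidate from the base LLM
    ask_verifier(prompt, candidate) -> bool : off-the-shelf LLM prompted to check one aspect
    """
    candidates = [generate(question) for _ in range(n)]
    approvals = Counter()
    for i, candidate in enumerate(candidates):
        for prompt in ASPECT_PROMPTS:
            # Each aspect verifier gives a binary approval; approvals are summed per candidate.
            if ask_verifier(prompt, candidate):
                approvals[i] += 1
    best_index = max(range(n), key=lambda i: approvals[i])
    return candidates[best_index]
```

Scaling along the proposed dimension then amounts to enlarging the set of (verifier model, aspect prompt) pairs, rather than increasing *n* alone.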
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 1666