SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision Language Model Systems

Published: 02 Mar 2026, Last Modified: 30 Mar 2026
Venue: Agentic AI in the Wild: From Hallucinations to Reliable Autonomy (Poster)
License: CC BY 4.0
Keywords: Vision-Language Models, Multi-Model Systems, Uncertainty Quantification, Hallucination
TL;DR: An uncertainty quantification framework for hallucination detection in multi-VLM systems
Abstract: Combining multiple Vision–Language Models (VLMs) can enhance multimodal reasoning and robustness, but aggregating the outputs of heterogeneous models amplifies uncertainty and increases the risk of hallucinations. We propose \textbf{SCoOP} (\emph{Semantic-Consistent Opinion Pooling}), a \emph{training-free} uncertainty quantification (UQ) framework for multi-VLM systems based on uncertainty-weighted linear opinion pooling. The core idea is to treat each VLM as a probabilistic ``expert,'' sample multiple outputs from it, map the outputs of all models to a unified space, aggregate the resulting opinions, and produce a system-level uncertainty score. Unlike prior UQ methods designed for single models, SCoOP explicitly measures collective, system-level uncertainty across multiple VLMs, enabling effective hallucination detection and abstention on highly uncertain samples. On ScienceQA, SCoOP achieves an \textbf{AUROC of 0.866} for hallucination detection, outperforming baselines (\textbf{0.732-0.757}) by approximately \textbf{10-13} percentage points. For abstention, it attains an \textbf{AURAC of 0.907}, exceeding baselines (\textbf{0.818-0.840}) by \textbf{7-9} percentage points. Despite these gains, SCoOP adds only microsecond-level aggregation overhead relative to the baselines, which is negligible compared to typical VLM inference time (on the order of seconds). These results demonstrate that SCoOP provides an efficient and principled mechanism for uncertainty-aware aggregation, advancing the reliability of multimodal AI systems. Our code is publicly available at \href{https://github.com/chungenyu6/SCoOP}{https://github.com/chungenyu6/SCoOP}.
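The pipeline described in the abstract (sample, map to a unified space, pool, score) can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the linked repository for that). It assumes that the unified space is a fixed set of answer labels, that each model's opinion is the empirical distribution of its sampled answers, and that each model's pooling weight is proportional to exp(-entropy) of that distribution. The function names and the weighting rule are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def empirical_dist(samples, labels):
    """Empirical distribution of one model's sampled answers over the shared labels.

    Assumes every sample is already mapped to one of `labels`.
    """
    counts = Counter(samples)
    n = len(samples)
    return [counts[label] / n for label in labels]


def entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0)


def pool_opinions(samples_per_model, labels):
    """Uncertainty-weighted linear opinion pool over several models.

    Weighting rule (an assumption, not taken from the paper):
    w_m is proportional to exp(-H(p_m)), so more self-consistent
    models receive larger weights.
    Returns (pooled distribution, system-level entropy, weights).
    """
    dists = [empirical_dist(s, labels) for s in samples_per_model]
    raw = [math.exp(-entropy(p)) for p in dists]
    total = sum(raw)
    weights = [r / total for r in raw]
    # Linear pool: weighted average of the per-model distributions.
    pooled = [
        sum(w * p[i] for w, p in zip(weights, dists))
        for i in range(len(labels))
    ]
    return pooled, entropy(pooled), weights


# Hypothetical example: two models, four sampled answers each.
labels = ["A", "B"]
samples_per_model = [
    ["A", "A", "A", "A"],  # fully self-consistent model
    ["A", "B", "A", "B"],  # maximally uncertain model
]
pooled, system_uncertainty, weights = pool_opinions(samples_per_model, labels)
```

In this toy case the self-consistent model receives twice the weight of the uncertain one, and the entropy of the pooled distribution serves as the system-level uncertainty score. A sample whose score exceeds a chosen threshold could then be flagged as a likely hallucination or abstained on.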
Submission Number: 60