Exploring Geometric Concentration for Quantifying Uncertainty in Scientific Image Caption Generation
Keywords: Uncertainty Quantification, Large Language Models, Sampling Based Methods
Abstract: Uncertainty Quantification (UQ) methods for Large Language Models (LLMs) have primarily been evaluated on question-answering benchmarks, where outputs are short and structured and comparisons between generations are relatively well-defined. In contrast, many practical generative tasks involve open-ended, complex outputs, motivating evaluation of current state-of-the-art UQ methods beyond simple question-answering settings. In this work, we explore the challenging task of UQ for scientific image captioning. Using a subset of the ArxivCap dataset and two popular multimodal LLMs, we compare \emph{Directional Concentration Uncertainty} (DCU), a geometric UQ measure proposed by \citet{dcu_2026}, against semantic entropy (SE) \citep{kuhnetal23}, a leading method for UQ on structured question-answering. Our results indicate that DCU clearly outperforms SE, motivating further research into applications of DCU to other complex tasks.
Submission Number: 20