Latent Veracity Inference for Identifying Errors in Stepwise Reasoning

Minsu Kim; Jean-Pierre R. Falet; Oliver Ethan Richardson; Xiaoyin Chen; Moksh Jain; Sungjin Ahn; Sungsoo Ahn; Yoshua Bengio

Latent Veracity Inference for Identifying Errors in Stepwise Reasoning

Minsu Kim, Jean-Pierre R. Falet, Oliver Ethan Richardson, Xiaoyin Chen, Moksh Jain, Sungjin Ahn, Sungsoo Ahn, Yoshua Bengio

Published: 26 Jan 2026, Last Modified: 11 Apr 2026ICLR 2026 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Latent variable models, Language models, Probabilistic inference, Veracity, Chain-of-thought

TL;DR: We propose a search-based latent-variable inference method for identifying errors in the reasoning chains of language models.

Abstract: Chain-of-Thought (CoT) reasoning has advanced the capabilities and transparency of language models (LMs); however, reasoning chains can contain inaccurate statements that reduce performance and trustworthiness. To address this, we propose to augment each reasoning step in a CoT with a latent veracity (or correctness) variable. To efficiently explore this expanded space, we introduce Veracity Search (VS), a discrete search algorithm over veracity assignments. It performs otherwise intractable inference in the posterior distribution over latent veracity values by leveraging the LM's joint likelihood over veracity and the final answer as a proxy reward. This efficient inference-time verification method facilitates supervised fine-tuning of an Amortized Veracity Inference (AVI) machine by providing pseudo-labels for veracity. AVI generalizes VS, enabling accurate zero-shot veracity inference in novel contexts. Empirical results demonstrate that VS reliably identifies errors in logical (ProntoQA), mathematical (GSM8K), and commonsense (CommonsenseQA) reasoning benchmarks, with AVI achieving comparable zero-shot accuracy. Finally, we demonstrate the utility of latent veracity inference for providing feedback during self-correction and self-improvement.

Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)

Submission Number: 14029

Loading