Keywords: Consistency, large language model, interpretability
Abstract: Interpreting reasoning methods that operate within an LLM's context, such as chain-of-thought, crucially depends on whether the model transparently follows the intended reasoning process.
We focus on whether the beliefs held by a model remain consistent before and after its context is extended.
Previous research on consistency evaluation typically uses data with a single correct answer, which is problematic: a model that cannot arrive at the correct answer in the first place cannot be meaningfully assessed for consistency. Moreover, cases where inconsistency stems from multiple errors are difficult to evaluate.
We propose a new evaluation method that assesses the consistency of LLMs in a multiple-choice question-answering format designed so that any chosen option is correct, enabling evaluation of the proposed notion of belief consistency. The design also isolates error sources such as reasoning failures and biases.
We reveal that belief consistency does not improve through model-size scaling alone, whereas continual pre-training on code and mathematics text does improve it.
Furthermore, models trained on code and mathematics text exhibit a seemingly contradictory increase in logical failures, indicating that belief consistency and superficial consistency are not necessarily directly linked.
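For concreteness, here is a minimal sketch of how the before/after consistency rate described above might be computed; the `ask` helper and the item fields are hypothetical illustrations, not the paper's actual protocol or code.

```python
from typing import Callable, Sequence

def belief_consistency(
    ask: Callable[[str, Sequence[str], str], int],  # hypothetical helper: returns the chosen option index
    items: Sequence[dict],
) -> float:
    """Fraction of items where the model's chosen option is unchanged after
    the context is extended. Because every option is correct by construction,
    a switch reflects inconsistency rather than error correction."""
    unchanged = 0
    for item in items:
        # Ask the same multiple-choice question before and after extending the context.
        before = ask(item["question"], item["options"], item["base_context"])
        after = ask(item["question"], item["options"],
                    item["base_context"] + item["extension"])
        unchanged += int(before == after)
    return unchanged / len(items)
```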
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: Explainability of NLP Models; Ethics, Bias, and Fairness; Generation; Interpretability and Analysis of Models for NLP; Language Modeling; Question Answering; Resources and Evaluation
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 9471