Knowing When to Predict: Confidence-Aware Temporal Reasoning in Large Language Models

ACL ARR 2026 January Submission 5725 Authors

05 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Temporal Reasoning, Confidence-Aware, Factual Prediction, QA
Abstract: Reasoning about future factual states remains a critical challenge for Large Language Models (LLMs). Current methods typically rely on static knowledge and fail to capture the dynamic patterns underlying factual change, causing LLMs to exhibit temporal-awareness deficits such as Persistence Bias and Change Insensitivity. To address these limitations, we propose Confidence-Aware Temporal Reasoning (CTR), a decision-centric framework that explicitly determines whether and how a future factual prediction should be made by separating the commitment to predict from answer generation, based on confidence-aware temporal stability. CTR leverages token-level entropy to weight historical evidence, allowing the model to distinguish reliable patterns from uncertain noise. We evaluate our approach on an updated version of the FRESHQA benchmark across multiple LLMs. Experimental results demonstrate consistent improvements in future factual reasoning: on GPT-4o, accuracy across all facts increases from 44.8% to 53.7% compared to prompt-based baselines. Moreover, CTR reduces hallucination rates by 14.2% on LLaMA-3.1 and 6.5% on GPT-4o, substantially improving the trustworthiness of future-oriented predictions.
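
The abstract does not spell out how token-level entropy is turned into evidence weights or how the commit/abstain decision is made; the sketch below is one illustrative reading of that mechanism, not the authors' implementation. All names (token_entropy, temporal_stability, decide), the evidence format, and the 0.7 threshold are hypothetical choices for exposition.

```python
import math

def token_entropy(probs):
    """Shannon entropy (nats) of one token's next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def sequence_confidence(per_token_probs):
    """Map mean token-level entropy to a (0, 1] confidence weight.
    Low entropy -> weight near 1 (reliable); high entropy -> near 0 (noisy)."""
    mean_h = sum(token_entropy(p) for p in per_token_probs) / len(per_token_probs)
    return math.exp(-mean_h)

def temporal_stability(evidence):
    """Entropy-weighted agreement across time-ordered evidence snapshots.

    `evidence` is a list of (answer, per_token_probs) pairs, oldest first.
    Returns the confidence-weighted share of snapshots agreeing with the
    most recent answer: values near 1 suggest a temporally stable fact.
    """
    latest_answer = evidence[-1][0]
    weights = [sequence_confidence(p) for _, p in evidence]
    agree = sum(w for (a, _), w in zip(evidence, weights) if a == latest_answer)
    return agree / sum(weights)

def decide(evidence, commit_threshold=0.7):
    """Separate commitment from generation: commit to a future prediction
    only when weighted stability is high; otherwise abstain rather than
    risk a hallucinated answer."""
    if temporal_stability(evidence) >= commit_threshold:
        return evidence[-1][0]
    return "UNCERTAIN: fact may change; abstaining from a future prediction"

if __name__ == "__main__":
    # Three snapshots that agree with low entropy -> the model commits.
    history = [
        ("Paris", [[0.90, 0.10]]),
        ("Paris", [[0.80, 0.20]]),
        ("Paris", [[0.85, 0.15]]),
    ]
    print(decide(history))  # -> "Paris"
```

The key design point this sketch illustrates is the two-stage decision: a stability score gates whether to predict at all, so low-confidence, fast-changing facts yield abstention instead of a confident but likely stale answer.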
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: logical reasoning, open-domain QA, reasoning
Contribution Types: Model analysis & interpretability, Data analysis, Theory
Languages Studied: English
Submission Number: 5725