Keywords: future event prediction, forecasting, model probing
Abstract: Prior work has largely treated future event prediction as a static task, failing to consider how forecasts, and the confidence placed in them, should evolve as new evidence emerges. To address this gap, we introduce EvolveCast, a framework for evaluating whether large language models appropriately revise their predictions in response to new information. In particular, EvolveCast assesses whether LLMs adjust their forecasts when presented with information released after their training cutoff. We use human forecasters as a comparative reference to analyze prediction shifts and confidence calibration under updated contexts. While LLMs demonstrate some responsiveness to new information, their updates are often inconsistent or overly conservative, revealing limitations in their ability to incorporate new evidence into their forecasts. Across settings, models exhibit a conservative bias, underscoring the need for more robust approaches to belief updating.
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 13175