Keywords: entropy, chain-of-thought, early exit
TL;DR: This paper pioneers the analysis of global entropy dynamics in CoT and identifies a confidence region that indicates answer convergence and redundancy. We then propose COnfidence Region Exits (CORE) to halt CoT generation upon entering this confidence region.
Abstract: Chain-of-Thought (CoT) prompting has demonstrated remarkable problem-solving capabilities in many large language models (LLMs), but the resulting reasoning traces often exhibit substantial redundancy. To mitigate this issue, various approaches have been explored to improve reasoning efficiency. In this paper, we focus on early exit methods, which are lightweight and can be seamlessly combined with other techniques. Existing methods typically trigger an early exit based on localized signals, such as single-step confidence scores or the stabilization of an intermediate trial answer over multiple steps. However, we observe that such signals often fail to confirm whether the underlying reasoning process is complete or sound. This paper is the first to diagnose the reasoning state of CoT through the global dynamics of its entropy. We reveal a consistent pattern: CoT generation begins in a high-entropy uncertainty region before transitioning into a stable, low-entropy confidence region, and we demonstrate that this transition strongly correlates with a complete reasoning process. Based on this insight, we propose COnfidence Region Exits (CORE), which stops CoT generation once the model enters the confidence region. Experiments show that our approach achieves a superior trade-off between computational cost and accuracy among early exit methods across various models, including Deepseek-R1-Distill-Qwen-7B, Qwen3-4B-Thinking-2507, and Qwen3-14B, on the AIME24, AIME25, and GPQA datasets. We believe CORE can serve as a strong efficient-reasoning method and provide insights for understanding CoT.
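The abstract's exit rule (stop generation once entropy settles into a stable low-entropy region) can be illustrated with a minimal sketch. The entropy definition, window size, and threshold below are illustrative assumptions for exposition, not the paper's actual CORE criterion or its hyperparameters.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def confidence_region_exit(entropy_trace, window=5, threshold=0.5):
    """Hypothetical entropy-based early exit: return the first step at which
    the mean entropy over the trailing `window` steps drops below `threshold`
    (i.e., generation has entered a stable low-entropy "confidence region"),
    or None if no such step exists. `window` and `threshold` are illustrative.
    """
    for t in range(window, len(entropy_trace) + 1):
        if sum(entropy_trace[t - window:t]) / window < threshold:
            return t
    return None

# Simulated trace: high-entropy uncertainty region, then low-entropy region.
trace = [2.0] * 5 + [0.3] * 5
exit_step = confidence_region_exit(trace)  # exits once the window is all low-entropy
```

In practice such a monitor would run on the model's next-token logits during decoding (e.g., as a custom stopping criterion), stopping CoT generation and prompting for the final answer at the detected step.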
Primary Area: generative models
Submission Number: 15096