Keywords: entropy, chain-of-thought, early exit
TL;DR: This paper pioneers the analysis of global entropy dynamics in CoT and identifies a confidence region that indicates answer convergence and redundancy. We then propose COnfidence Region Exits (CORE) to halt CoT generation upon entering this confidence region.
Abstract: Chain-of-Thought (CoT) prompting has demonstrated remarkable problem-solving capabilities in many large language models (LLMs), but the resulting reasoning traces often exhibit substantial redundancy. To mitigate this issue, various approaches have been explored to improve reasoning efficiency. In this paper, we focus on early exit methods, which are lightweight and can be seamlessly combined with other techniques. Existing methods typically trigger an early exit based on localized signals, such as single-step confidence scores or the stabilization of an intermediate trial answer over multiple steps. However, we observe that such signals often fail to confirm whether the underlying reasoning process is complete or sound. This paper is the first to diagnose the reasoning state of CoT through the global dynamics of its entropy. We reveal a consistent pattern: CoT generation begins in a high-entropy uncertainty region before transitioning into a stable, low-entropy confidence region, and we demonstrate that this transition strongly correlates with a complete reasoning process. Based on this insight, we propose COnfidence Region Exits (CORE), which stops CoT generation once the model enters the confidence region. Experiments show that our approach achieves a superior trade-off between computational cost and accuracy among early exit methods across various models, including Deepseek-R1-Distill-Qwen-7B, Qwen3-4B-Thinking-2507, and Qwen3-14B, on the AIME24, AIME25, and GPQA datasets. We believe CORE can serve as a strong efficient-reasoning method and provide insights for understanding CoT.
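The abstract's exit rule (stop generation once entropy settles into a stable low-entropy region) can be illustrated with a minimal sketch. The entropy definition, window size, and threshold below are illustrative assumptions for exposition, not the paper's actual CORE criterion or its hyperparameters.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def confidence_region_exit(entropy_trace, window=5, threshold=0.5):
    """Hypothetical entropy-based early exit: return the first step at which
    the mean entropy over the trailing `window` steps drops below `threshold`
    (i.e., generation has entered a stable low-entropy "confidence region"),
    or None if no such step exists. `window` and `threshold` are illustrative.
    """
    for t in range(window, len(entropy_trace) + 1):
        if sum(entropy_trace[t - window:t]) / window < threshold:
            return t
    return None

# Simulated trace: high-entropy uncertainty region, then low-entropy region.
trace = [2.0] * 5 + [0.3] * 5
exit_step = confidence_region_exit(trace)  # exits once the window is all low-entropy
```

In practice such a monitor would run on the model's next-token logits during decoding (e.g., as a custom stopping criterion), stopping CoT generation and prompting for the final answer at the detected step.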
Primary Area: generative models
Submission Number: 15096