Keywords: Efficient Reasoning, Conformal Prediction
Abstract: While LLMs have seen substantial improvements in reasoning capabilities, they also sometimes overthink, generating unnecessary reasoning steps, particularly when faced with ill-posed or ambiguous queries. To mitigate this issue, we introduce statistically principled early-stopping methods that monitor uncertainty signals during generation. Our first approach is nonparametric and provides finite-sample guarantees on the probability of halting too early on well-posed queries. Our second approach is parametric: it models the inter-arrival times of uncertainty keywords as a renewal process and applies a sequential test to decide when to stop.
We conduct empirical evaluations on reasoning tasks across several domains and models. Our results indicate that uncertainty-aware early stopping can improve both the efficiency and the reliability of LLM reasoning. Performance varies across domains, with especially pronounced gains on mathematical reasoning.
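The abstract names the two ingredients but gives no implementation details. As a rough, purely illustrative sketch of the first (nonparametric) idea, the snippet below shows one plausible conformal-style calibration: a stopping threshold is chosen from well-posed calibration queries so that, under exchangeability, a fresh well-posed query is halted too early with probability at most a user-chosen alpha. All names and default values here (calibrate_threshold, should_stop, alpha=0.1) are assumptions for illustration, not the paper's actual procedure.

```python
import numpy as np

def calibrate_threshold(calib_max_scores, alpha=0.1):
    """Conformal-style threshold (hypothetical sketch).

    calib_max_scores[i] is the maximum of the uncertainty signal over the
    full, un-truncated reasoning trace of the i-th well-posed calibration
    query.  Stopping a fresh well-posed query whenever its running signal
    exceeds the returned threshold then happens with probability at most
    alpha, by exchangeability of the max scores."""
    scores = np.sort(np.asarray(calib_max_scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))  # conformal rank
    if k > n:                                # too little calibration data:
        return np.inf                        # never trigger early stopping
    return scores[k - 1]

def should_stop(running_scores, threshold):
    """Halt once the uncertainty signal observed so far crosses the threshold."""
    return len(running_scores) > 0 and max(running_scores) > threshold
```

An equally hypothetical sketch of the second (parametric) idea: if the gaps between uncertainty keywords (e.g., "wait", "hmm") are modeled as exponential inter-arrival times of a renewal process, a Wald sequential probability ratio test can decide between an ordinary hedging rate and an elevated one. The rates lam0, lam1 and the error levels below are placeholders, not values from the paper.

```python
import math

class KeywordSPRT:
    """Wald SPRT on inter-arrival times of uncertainty keywords (sketch).

    Assumes gaps (in tokens) between uncertainty keywords are i.i.d.
    exponential, i.e. keyword occurrences form a Poisson renewal process.
    H0: rate lam0 (ordinary hedging).  H1: rate lam1 > lam0 (the model is
    hedging unusually often)."""

    def __init__(self, lam0=0.02, lam1=0.10, alpha=0.05, beta=0.05):
        self.lam0, self.lam1 = lam0, lam1
        self.upper = math.log((1 - beta) / alpha)  # cross -> accept H1, stop
        self.lower = math.log(beta / (1 - alpha))  # cross -> accept H0, reset
        self.llr = 0.0

    def observe_gap(self, gap_tokens):
        """Update with the token gap since the previous uncertainty keyword."""
        self.llr += math.log(self.lam1 / self.lam0) - (self.lam1 - self.lam0) * gap_tokens
        if self.llr >= self.upper:
            return "stop"          # evidence of excessive uncertainty
        if self.llr <= self.lower:
            self.llr = 0.0         # restart the test and keep generating
        return "continue"
```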
Submission Number: 137