ThinkBrake: Mitigating Overthinking in Tool Reasoning

Minjae Oh; Sangjun Song; Seungkyu Lee; Sungmin Jo; Yohan Jo

Published: 16 Oct 2025, Last Modified: 10 Nov 2025, NeurIPS 2025 ER Workshop, CC BY 4.0

Keywords: Tool usage, Large Language Models, Concise Reasoning

Abstract: Small reasoning models (SRMs) often overthink during tool use: they reach a correct tool-argument configuration, then continue reasoning and overwrite it with an incorrect final call. We diagnose overthinking via oracle rollouts that inject </think> at sentence boundaries. On the Berkeley Function Calling Leaderboard (BFCL), this oracle termination lifts average accuracy from 85.8% to 94.2% while reducing tokens by 80–94%, revealing substantial recoverable headroom and potentially redundant reasoning. While prior work on concise reasoning has largely targeted mathematics, tool reasoning remains underexplored. We adapt several early-termination baselines to tool use and introduce ThinkBrake, a training-free decoding heuristic. ThinkBrake monitors the log-probability margin between </think> and the current top token at sentence boundaries and triggers termination when this margin becomes small. Across BFCL's single-turn non-live and live splits, ThinkBrake preserves or improves accuracy while reducing token usage by up to 25%, outperforming the adapted baselines.
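
The termination rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold value of 0.5, the punctuation-based sentence-boundary check, and the exact definition of the margin (top-token log-probability minus the </think> log-probability) are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Convert raw next-token logits to log-probabilities (numerically stable)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def is_sentence_boundary(text):
    """Hypothetical boundary check: the generated text ends with sentence punctuation."""
    return text.rstrip().endswith((".", "?", "!"))


def should_brake(logits, end_think_id, threshold=0.5):
    """Return True when </think> is nearly as likely as the current top token.

    `end_think_id` is the vocabulary index of </think>; `threshold` is an
    assumed hyperparameter, not a value taken from the paper.
    """
    logprobs = log_softmax(logits)
    margin = max(logprobs) - logprobs[end_think_id]  # always >= 0
    return margin <= threshold


# Toy 3-token vocabulary where index 1 stands in for </think>.
generated = "The arguments match the schema."
if is_sentence_boundary(generated) and should_brake([2.0, 1.8, -1.0], end_think_id=1):
    print("inject </think> and emit the tool call")
```

In a real decoding loop the check would run only at sentence boundaries, so the extra cost is one comparison over logits the model has already computed.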

Submission Number: 226
