SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning

SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning

ICLR 2026 Conference Submission15583 Authors

19 Sept 2025 (modified: 08 Oct 2025)ICLR 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Efficient Reasoning, LLM, Overthinking

Abstract: While reasoning models demonstrate exceptional performance on complex tasks, they often exhibit tendencies of overthinking on simple problems. This phenomenon not only leads to excessive computational resource consumption but also significantly degrades user experience. To address this challenge, we propose SelfBudgeter - a novel user-friendly adaptive controllable reasoning framework that incorporates a budget estimation mechanism prior to reasoning. The framework adopts a dual-phase training paradigm: during the cold-start phase, the model learns to predict token budgets before executing reasoning in a standardized format; in the reinforcement learning phase, the model is trained to autonomously plan budgets based on problem difficulty and strictly adhere to them when generating responses. Since the model outputs budget estimates at the initial stage, users can immediately anticipate waiting duration, enabling flexible decisions on whether to interrupt or continue the generation process. Notably, our method supports manual control of reasoning length through pre-filled budget fields. Experimental results demonstrate that SelfBudgeter can dynamically allocate budgets according to problem complexity, yielding an average response length compression of 61% for the 1.5B model on GSM8K, MATH500, and AIME2025, and 48% for the 7B model, while maintaining nearly undiminished accuracy.

Supplementary Material: zip

Primary Area: generative models

Submission Number: 15583

Loading