Keywords: large language model, reasoning, concise reasoning
Abstract: Concise reasoning in large language models seeks to generate only the essential intermediate steps needed to arrive at a final answer, thereby alleviating overthinking. Most existing approaches hinge on carefully hand-crafted heuristics and struggle to balance concision with performance, often failing to adapt across domains and model scales. In this work, we address these challenges by introducing a principled and pragmatic strategy, performance-aware length updating (PALU).
As a principled algorithm, PALU formulates concise reasoning as a constrained optimization problem, minimizing response length subject to a performance constraint, and then applies *Lagrangian* optimization to convert it into a tractable unconstrained problem.
As a pragmatic solution, PALU streamlines the complicated update rules through three approximations: *(i)* estimating performance with off-policy rollouts, *(ii)* truncating the *Lagrange* multiplier to two extremes, and *(iii)* replacing gradient-based updates with quantile-driven length adjustments. Averaged over five benchmarks, PALU reduces output length by 65\% while improving accuracy by 15\% when applied to *DeepSeek-Distill-Qwen-1.5B*, outperforming a range of alternative methods. Furthermore, PALU adapts across domains (logic, STEM, and math) and model scales (1.5B, 7B, 14B), establishing it as a practical and effective concise-reasoning approach.
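The constrained formulation and its Lagrangian relaxation described above can be sketched as follows; the symbols ($\theta$, $L$, $\mathrm{Acc}$, $\epsilon$, $\lambda$) are illustrative placeholders, not notation taken from the paper:

$$
\min_{\theta} \; \mathbb{E}_{y \sim \pi_\theta}\!\left[ L(y) \right]
\quad \text{s.t.} \quad \mathrm{Acc}(\pi_\theta) \geq \mathrm{Acc}_0 - \epsilon
$$

$$
\min_{\theta} \; \max_{\lambda \geq 0} \; \mathbb{E}_{y \sim \pi_\theta}\!\left[ L(y) \right] + \lambda \left( \mathrm{Acc}_0 - \epsilon - \mathrm{Acc}(\pi_\theta) \right)
$$

Here $L(y)$ is the response length, $\mathrm{Acc}_0$ a reference accuracy, and $\epsilon$ a tolerance; the Lagrange multiplier $\lambda$ trades off length against the performance constraint.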
Primary Area: foundation or frontier models, including LLMs
Submission Number: 19020