Think Earlier, Not Longer: Prompt Optimization via Reducing Unhealthy Exploration

ACL ARR 2026 January Submission8563 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · License: CC BY 4.0
Keywords: reasoning efficiency, prompt optimization, large language models, entropy dynamics
Abstract: While large language models exhibit strong reasoning capabilities, prior work shows that their performance can be further enhanced by encouraging greater exploration. However, existing approaches overlook the presence of unhealthy exploration that increases exploration-related token usage without contributing to effective problem-solving. In this work, we show that prompt ambiguity can artificially prolong early-stage exploration, manifested as an elevated and delayed early-stage entropy peak. Although this uncertainty may be gradually resolved as reasoning progresses, reflected in the eventual convergence of the late-stage entropy peak, it does not meaningfully improve accuracy or self-consistency and instead substantially reduces reasoning efficiency. Motivated by these observations, we propose an entropy-dynamics-aware prompt optimization framework that trains a lightweight optimizer to generate concise clarifications. These clarifications aim to reduce ambiguity-induced early-stage uncertainty while preserving the model’s reasoning capabilities. Extensive experiments across multiple models, reasoning budgets, and benchmarks demonstrate that our approach consistently improves reasoning efficiency by up to 52\%, by reducing unhealthy exploration without sacrificing accuracy.
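The abstract's central diagnostic is the entropy trajectory of a reasoning trace: per-token next-token entropy, with an "early-stage peak" whose height and position indicate ambiguity-driven exploration. The paper's exact measurement is not given here, so the following is a minimal illustrative sketch of that idea. The function names (`token_entropy`, `entropy_trajectory`, `early_peak`) and the early-segment fraction are assumptions for illustration, not the authors' implementation.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_trajectory(prob_seq):
    """Per-step entropy over a generated trace (one distribution per step)."""
    return [token_entropy(p) for p in prob_seq]

def early_peak(trajectory, early_frac=0.3):
    """Height and position of the entropy peak within the early segment.

    `early_frac` (an illustrative choice) defines how much of the trace
    counts as "early-stage"; an elevated, late-positioned peak here would
    signal the ambiguity-induced exploration the abstract describes.
    """
    cut = max(1, int(len(trajectory) * early_frac))
    early = trajectory[:cut]
    peak = max(early)
    return peak, early.index(peak)
```

Under this reading, a clarifying prompt succeeds if it lowers and advances the early peak ("think earlier") rather than extending the trace ("not longer").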
Paper Type: Long
Research Area: LLM Efficiency
Research Area Keywords: LLM Efficiency
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 8563