Thermometer of Thoughts: Enhancing LLM's Exploration via Attention Temperature Modulation

ACL ARR 2026 January Submission 2658 Authors

03 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Large Language Model, Reasoning
Abstract: Improving exploration during reasoning is essential for advancing Large Language Models' (LLMs) problem-solving performance. Current methods rely primarily on output-level stochasticity, which decodes within the LLM's fixed reasoning patterns and suffers from insufficient exploration. In this paper, we introduce attention temperature adjustment to directly modulate the model's internal focus during reasoning, enabling a dynamic shift between exploratory and focused processing. We find that moderate adjustments preserve the LLM's reasoning capability while producing problem-dependent benefits: higher temperatures facilitate solving complex tasks by encouraging wider exploration, whereas lower temperatures mitigate overthinking on simpler problems. Leveraging this insight, we propose a two-stage inference strategy: first, attention temperature scaling modulates the LLM's reasoning patterns to diversify its reasoning traces; then, a difficulty-aware aggregation scheme identifies the most reliable solution among the generated candidates. Extensive evaluations show that our method improves Pass@10 by 9.95--16.9% and aggregation accuracy by 7.2% across seven reasoning benchmarks.
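To make the core mechanism concrete, the sketch below assumes "attention temperature" means dividing the pre-softmax attention logits by a scalar tau (the function name and NumPy single-head setup are illustrative, not the authors' implementation): tau > 1 flattens the attention distribution, spreading focus over more tokens (exploration), while tau < 1 sharpens it (focused processing).

```python
import numpy as np

def attention_with_temperature(Q, K, V, tau=1.0):
    """Single-head scaled dot-product attention with an extra temperature tau.

    tau > 1 flattens the attention weights (wider exploration);
    tau < 1 sharpens them (more focused processing).
    """
    d_k = Q.shape[-1]
    # Standard 1/sqrt(d_k) scaling, with the logits further divided by tau.
    logits = Q @ K.T / (np.sqrt(d_k) * tau)
    # Numerically stable softmax over the key dimension.
    logits -= logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Under this reading, the paper's two-stage strategy would sample reasoning traces at several tau values and then aggregate the candidates with the difficulty-aware scheme.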
Paper Type: Long
Research Area: Discourse, Pragmatics, and Reasoning
Research Area Keywords: Question Answering
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2658