Thermometer of Thoughts: Enhancing LLM's Exploration via Attention Temperature Modulation

ACL ARR 2026 January Submission 2658 Authors

03 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Large Language Model, Reasoning
Abstract: Improving exploration during reasoning is essential for advancing Large Language Models' (LLMs) problem-solving performance. Current methods rely primarily on output-level stochasticity, which decodes within the LLM's fixed reasoning patterns and suffers from insufficient exploration. In this paper, we introduce attention temperature adjustment to directly modulate the model's internal focus during reasoning, enabling a dynamic shift between exploratory and focused processing. We find that moderate adjustments preserve the LLM's reasoning capability while producing problem-dependent benefits: higher temperatures facilitate solving complex tasks by encouraging wider exploration, whereas lower temperatures mitigate overthinking on simpler problems. Leveraging this insight, we propose a two-stage inference strategy: first, attention temperature scaling modulates the LLM's reasoning patterns to diversify its reasoning traces; then, a difficulty-aware aggregation scheme identifies the most reliable solution among the generated candidates. Extensive evaluations show that our method improves Pass@10 by 9.95--16.9% and aggregation accuracy by 7.2% across seven reasoning benchmarks.
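To make the core mechanism concrete, the sketch below assumes "attention temperature" means dividing the pre-softmax attention logits by a scalar tau (the function name and NumPy single-head setup are illustrative, not the authors' implementation): tau > 1 flattens the attention distribution, spreading focus over more tokens (exploration), while tau < 1 sharpens it (focused processing).

```python
import numpy as np

def attention_with_temperature(Q, K, V, tau=1.0):
    """Single-head scaled dot-product attention with an extra temperature tau.

    tau > 1 flattens the attention weights (wider exploration);
    tau < 1 sharpens them (more focused processing).
    """
    d_k = Q.shape[-1]
    # Standard 1/sqrt(d_k) scaling, with the logits further divided by tau.
    logits = Q @ K.T / (np.sqrt(d_k) * tau)
    # Numerically stable softmax over the key dimension.
    logits -= logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Under this reading, the paper's two-stage strategy would sample reasoning traces at several tau values and then aggregate the candidates with the difficulty-aware scheme.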
Paper Type: Long
Research Area: Discourse, Pragmatics, and Reasoning
Research Area Keywords: Question Answering
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2658