The Guessing Dilemma: Unveiling LLMs' Reasoning Changes Under Short-Path Prompting

ACL ARR 2025 May Submission5856 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Recent years have witnessed significant progress in large language models' (LLMs) reasoning, largely driven by chain-of-thought (CoT) approaches, which allow models to generate intermediate reasoning steps before reaching the final answer. Building on these advances, state-of-the-art LLMs are instruction-tuned to provide long and detailed CoT pathways when responding to reasoning-related questions. However, human beings are naturally cognitive misers and often prompt language models to give rather short responses, creating a significant conflict with CoT reasoning. In this paper, we delve into how LLMs' reasoning performance changes when users provide short-path prompts. The results and analysis reveal that instruct models can reason effectively and robustly without explicit CoT prompts, whereas under short-path prompting, LLMs tend to guess the final answer and their reasoning ability becomes unstable, even on grade-school problems. Furthermore, we propose two approaches to explore whether this decision-making bias can be calibrated to prioritize reasoning accuracy over strict instruction following. Experimental results show that both methods achieve high accuracy, providing insights into the trade-off between instruction following and reasoning accuracy in current models.
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: prompting, reasoning, large language model
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 5856