Less is Not Worse: Effective Reasoning Without Complete Reasoning Traces

08 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: reasoning trajectory, large language model, supervised fine-tuning
Abstract: Large language models (LLMs) often produce lengthy reasoning traces with substantial token redundancy. While reasoning traces are widely used to tune LLMs as a post-training regime, it remains underexplored whether LLMs truly learn from the complete trajectory, particularly in supervised fine-tuning (SFT). We argue that, for mid-size LLMs commonly trained with SFT for reasoning, using full reasoning trajectories may harm performance, because their limited capacity makes them susceptible to redundant intermediate steps. To investigate, we first analyze the redundancy in thinking trajectories through attention maps and controlled token-removal studies, both of which show that intermediate tokens contribute minimally to reasoning quality. Our analyses suggest that the most redundant segments typically appear in the middle of reasoning traces, whereas the earlier and later segments are crucial for generating high-quality final outcomes. We further posit that removing redundant intermediate information encourages LLMs to infer concise and coherent intermediate steps from the known start and end points. Based on these insights, we propose MidCut, a method that removes redundant middle steps during both training and inference. We demonstrate the effectiveness of MidCut in two scenarios for LLM reasoning: (1) SFT on the s1K and OpenThoughts reasoning datasets; and (2) as a decoding strategy at test time.
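The core trimming idea — keep the head and tail of a reasoning trace and drop the middle — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `midcut`, the step-level granularity, and the keep-fractions are all assumptions for the sake of the example.

```python
def midcut(steps, keep_head=0.3, keep_tail=0.3):
    """Drop the middle portion of a reasoning trace, keeping the
    opening and closing segments. Fractions are illustrative, not
    values from the paper."""
    n = len(steps)
    h = max(1, int(n * keep_head))
    t = max(1, int(n * keep_tail))
    if h + t >= n:  # trace too short to trim; keep everything
        return list(steps)
    return steps[:h] + steps[n - t:]

# Example: a 10-step trace keeps the first 3 and last 3 steps.
trace = [f"step {i}" for i in range(10)]
print(midcut(trace))
```

In an SFT setting, the trimmed trace would replace the full trajectory as the training target; at test time, the same head-plus-tail structure could guide decoding to skip redundant middle tokens.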
Primary Area: foundation or frontier models, including LLMs
Submission Number: 3081