Keywords: LLM Reasoning, Efficient LLM
Abstract: Recent large language models with chain-of-thought reasoning capabilities exhibit poor token efficiency due to Hesitation: spending excessive tokens verifying answers that are already correct. Using our Probe-In-The-Middle technique to analyze model states during reasoning, we propose Dynasor-CoT, a certainty-based approach for dynamically terminating reasoning. Our training-free method achieves up to a 29% token reduction while maintaining accuracy on mathematical reasoning tasks such as AMC23, AIME24, and MATH500.
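The certainty-based early-termination idea described in the abstract can be sketched as follows. This is a hypothetical illustration under assumed interfaces, not the authors' implementation: `generate_chunk` and `probe_answer` are stand-ins for the actual model calls, and the agreement-based stopping rule is one plausible reading of "certainty-based termination."

```python
# Hypothetical sketch of certainty-based dynamic reasoning termination.
# generate_chunk(trace) extends the chain of thought by one segment;
# probe_answer(trace) elicits the model's current best answer mid-reasoning
# (e.g., by appending a prompt like "... the final answer is" and decoding).

def dynamic_reasoning(generate_chunk, probe_answer, max_chunks=32, agree_needed=3):
    """Stop generating once several consecutive mid-reasoning probes agree."""
    trace = ""
    probed = []  # answers extracted by successive probes
    for _ in range(max_chunks):
        trace += generate_chunk(trace)   # continue the reasoning trace
        answer = probe_answer(trace)     # probe the model's current answer
        probed.append(answer)
        # Terminate early when the last `agree_needed` probes return the
        # same non-empty answer, taken here as a proxy for model certainty.
        if (len(probed) >= agree_needed and answer
                and len(set(probed[-agree_needed:])) == 1):
            return answer, trace
    return (probed[-1] if probed else None), trace
```

Tokens spent after the probes stabilize are the "Hesitation" the method avoids; the `agree_needed` threshold trades off savings against the risk of stopping on a transiently repeated wrong answer.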
Submission Number: 96