Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Poster · CC BY 4.0
TL;DR: This paper theoretically analyzes external slow-thinking methods in LLMs, linking snowball errors to reasoning accuracy and providing insights to enhance the interpretability of existing approaches.
Abstract: Test-time scaling, often referred to as *slow-thinking*, has been shown to enhance multi-step reasoning in large language models (LLMs). However, despite its widespread use, the mechanisms underlying slow-thinking methods remain poorly understood. This paper explores the mechanisms of external slow-thinking from a theoretical standpoint. We begin by examining the snowball error effect within the LLM reasoning process and connect it to the likelihood of correct reasoning using information theory. Building on this, we show that external slow-thinking methods can be interpreted as strategies to mitigate the probability of error. We further provide a comparative analysis of popular external slow-thinking approaches, ranging from simple to complex, highlighting their differences and interrelationships. Our findings suggest that the efficacy of these methods is not primarily determined by the specific framework employed, and that expanding the search scope or the model's internal reasoning capacity may yield more sustained improvements in the long term. We open-source our code at https://github.com/ZyGan1999/Snowball-Errors-and-Probability.
Lay Summary: LLMs often perform better on complex, step-by-step reasoning tasks when using techniques known as test-time scaling or "slow-thinking." These methods are widely used because they boost accuracy, but the fundamental reason they are effective has been poorly understood. To uncover why slow-thinking works, particularly for methods that require no extra training (which we call external slow-thinking), our research delves into the underlying mechanics. We introduce and analyze the concept of a "snowball error effect," where small initial mistakes accumulate and grow during the LLM's reasoning process. Using information theory, we show that external slow-thinking methods can be understood as strategies that effectively reduce the probability of these errors escalating. We also compare several popular slow-thinking approaches through this lens. This work provides a theoretical explanation for how making LLMs "think slower" improves their reasoning. Our findings suggest that the specific technique used may be less critical than commonly assumed; for more significant long-term gains in LLM reasoning, expanding the LLM's search for solutions or enhancing its core internal reasoning capacity may be more fruitful directions.
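To make the snowball intuition concrete, here is a minimal Python sketch (not from the paper or its repository). It assumes a toy model in which every reasoning step is independently correct with probability `p_step`, so a chain of `num_steps` steps is fully correct with probability `p_step ** num_steps`, and it models resampling-style external slow-thinking (e.g., best-of-N) under the further assumption of a perfect verifier that can recognize a correct chain. All parameter values are illustrative.

```python
import random

def run_chain(p_step: float, num_steps: int) -> bool:
    """Simulate one reasoning chain under the toy model.

    Each step is independently correct with probability p_step; a single
    early mistake "snowballs", so the chain counts as correct only if
    every step succeeds: P(correct chain) = p_step ** num_steps.
    """
    return all(random.random() < p_step for _ in range(num_steps))

def best_of_n(p_step: float, num_steps: int, n: int) -> bool:
    """Resampling-style external slow-thinking: draw n independent chains
    and succeed if at least one is fully correct (assumes a perfect
    verifier that can pick out the correct chain)."""
    return any(run_chain(p_step, num_steps) for _ in range(n))

if __name__ == "__main__":
    random.seed(0)
    trials = 20_000
    p_step, num_steps = 0.95, 30  # illustrative per-step accuracy and chain length
    for n in (1, 4, 16, 64):
        wins = sum(best_of_n(p_step, num_steps, n) for _ in range(trials))
        # Analytic success probability: 1 - (1 - p_step**num_steps) ** n
        analytic = 1 - (1 - p_step ** num_steps) ** n
        print(f"n={n:3d}  empirical={wins / trials:.3f}  analytic={analytic:.3f}")
```

Under these assumptions, even 95% per-step accuracy leaves a 30-step chain correct only about 21% of the time, while widening the search (larger n) pushes the success rate back up, which is the sense in which external slow-thinking trades extra sampling for a lower probability of snowballed error.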
Link To Code: https://github.com/ZyGan1999/Snowball-Errors-and-Probability
Primary Area: Deep Learning->Large Language Models
Keywords: large language models, test-time scaling, information theory
Submission Number: 5407