Keywords: Large Language Models, Chain-of-Thought Reasoning, Interpretability, Uncertainty Quantification
Abstract: Understanding the internal reasoning processes of Large Language Models (LLMs) when confronted with complex challenges represents a core problem for interpretability research. This paper introduces a novel diagnostic probe—the Conditional Pivotal Reasoning Ratio (CPRR)—to reveal a fundamental characteristic of LLM reasoning dynamics. CPRR captures a phenomenon we term "confident uncertainty" by quantifying the model's propensity to engage in statistically surprising (high-perplexity) exploration when making high-confidence decisions.
Through an analysis of tens of thousands of reasoning paths from two LLMs with distinct training histories on the AIME mathematical competition dataset, we identify a robust pattern: on problems the models find difficult, successful reasoning paths exhibit significantly higher CPRR during the crucial initial planning phase than do failing paths. This "peak in thinking" is absent on simpler problems. The existence of this quantifiable probabilistic "signature" reveals that effective reasoning begins with a more intense phase of initial exploration. We further substantiate through qualitative analysis that the high-frequency tokens identified by CPRR semantically constitute a "cognitive toolkit" for solving difficult problems.
This research provides a new analytical dimension for understanding how LLMs "think," shifting the focus of inquiry from the static correctness of final answers to an analysis of the dynamic, and at times noisy, reasoning process itself.
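As a concrete illustration of the metric, the sketch below shows one plausible way to operationalize CPRR from per-token generation statistics (Python). The confidence and surprise thresholds, the identification of "pivotal" high-confidence steps via top-1 probability, and the normalization against the trajectory-wide surprise rate are illustrative assumptions, not the paper's exact definition.

import numpy as np

def conditional_pivotal_reasoning_ratio(
    token_logprobs: np.ndarray,         # log-probability of each generated token
    top1_probs: np.ndarray,             # probability of the argmax token at each step
    confidence_threshold: float = 0.9,  # assumed cutoff for a "high-confidence" decision
    surprise_quantile: float = 0.9,     # assumed cutoff for a "high-perplexity" token
) -> float:
    """Illustrative operationalization of CPRR; not the paper's exact formula.

    A step is treated as pivotal (high-confidence) when the model's top-1
    probability exceeds the confidence threshold, and as surprising when the
    generated token's log-probability falls in the lowest tail of the
    trajectory. CPRR is the rate of surprising tokens among pivotal steps,
    normalized by the overall rate of surprising tokens; values above 1
    indicate "confident uncertainty".
    """
    surprise_cut = np.quantile(token_logprobs, 1.0 - surprise_quantile)
    surprising = token_logprobs <= surprise_cut   # high-perplexity (low log-prob) steps
    pivotal = top1_probs >= confidence_threshold  # high-confidence decision points

    base_rate = surprising.mean()
    if pivotal.sum() == 0 or base_rate == 0.0:
        return 0.0
    return float(surprising[pivotal].mean() / base_rate)

# Minimal usage on synthetic per-token statistics from a single reasoning path.
rng = np.random.default_rng(0)
logps = rng.normal(loc=-2.0, scale=1.0, size=500)
top1 = rng.uniform(low=0.0, high=1.0, size=500)
print(conditional_pivotal_reasoning_ratio(logps, top1))

For a phase-level analysis such as the initial planning segment discussed in the abstract, the same ratio would be computed over that window of tokens only.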
Primary Area: interpretability and explainable AI
Submission Number: 15110