Understanding Chain-of-Thought in LLMs through Information Theory

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We formalize a framework for evaluating Chain-of-Thought reasoning using information theory. The proposed method detects failure modes in LLM reasoning at a higher rate than existing methods.
Abstract: Large Language Models (LLMs) have shown impressive performance in complex reasoning tasks through the use of Chain-of-Thought (CoT) reasoning, allowing models to break down problems into manageable sub-tasks. However, existing CoT evaluation techniques either require annotated CoT data or fall short of accurately assessing intermediate reasoning steps, leading to high rates of false positives. In this paper, we formalize CoT reasoning in LLMs through an information-theoretic lens. Specifically, our framework quantifies the "information gain" at each reasoning step, enabling the identification of failure modes in LLMs without the need for expensive annotated datasets. We demonstrate the efficacy of our approach through extensive experiments on toy arithmetic, GSM8K and PRM800k datasets, where it significantly outperforms existing outcome-based methods by providing more accurate insights into model performance on individual tasks.
Lay Summary:

**Problem:** When AI language models solve complex problems, they break them down into step-by-step reasoning called "Chain-of-Thought." However, we currently have no reliable way to identify which specific steps in this reasoning process are incorrect without expensive human annotation of every step. Existing methods often give false alarms, incorrectly flagging correct reasoning steps as wrong.

**Solution:** We developed a mathematical framework using information theory to automatically detect reasoning errors. Our method measures how much useful information each reasoning step contributes toward the correct final answer. When a step fails to add meaningful information (or even reduces confidence in the correct answer), this signals an error in the model's reasoning process. We train a separate "supervisor" model that can assess whether each step brings the AI closer to the right solution.

**Impact:** Our approach can pinpoint exactly where AI reasoning goes wrong without requiring humans to manually check every step, making it much more practical and cost-effective than current methods. This enables researchers and developers to identify specific weaknesses in AI reasoning systems and improve them more efficiently. As AI systems become more complex and are used in critical applications, having reliable tools to evaluate and improve their reasoning processes becomes essential for building trustworthy AI.
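To make the per-step "information gain" idea concrete, here is a minimal sketch (not the authors' implementation): it scores each reasoning step by how much it raises a scoring model's log-likelihood of the gold answer, and treats steps with non-positive gain as candidate failure points. The model name, prompt format, and zero threshold are illustrative assumptions, and a plain pretrained LM stands in for the trained supervisor model described in the paper.

```python
# Sketch: per-step information gain as the change in log p(answer | question, steps[:t]).
# Assumptions: "gpt2" is a stand-in for the supervisor model; prompt layout is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def answer_logprob(context: str, answer: str) -> float:
    """Log-probability of the answer tokens given the context under the model."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    ans_ids = tokenizer(answer, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, ans_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Next-token prediction: position i scores the token at position i + 1.
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    positions = range(ctx_ids.shape[1] - 1, input_ids.shape[1] - 1)
    targets = input_ids[0, ctx_ids.shape[1]:]
    return sum(logprobs[0, pos, tok].item() for pos, tok in zip(positions, targets))

def stepwise_information_gain(question: str, steps: list[str], answer: str) -> list[float]:
    """Gain of step t = log p(answer | question, steps[:t]) - log p(answer | question, steps[:t-1])."""
    gains, prev = [], answer_logprob(question, answer)
    context = question
    for step in steps:
        context = context + "\n" + step
        cur = answer_logprob(context, answer)
        gains.append(cur - prev)
        prev = cur
    return gains

# Usage: steps whose gain is near zero or negative add no information about the
# correct answer and are flagged as candidate reasoning failures.
question = "Q: Tom has 3 boxes of 4 apples. How many apples does he have? A:"
steps = ["Step 1: Each box has 4 apples.", "Step 2: 3 * 4 = 12."]
print(stepwise_information_gain(question, steps, " 12"))
```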
Primary Area: Deep Learning->Large Language Models
Keywords: Large Language Models, Chain-of-Thought, Information Theory
Submission Number: 10908