ProcessLID: Step-wise Internal Reward in LLM Reasoning via Local Intrinsic Dimension

20 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0
Keywords: Large Language Model, Mathematical Reasoning, Intrinsic Dimension
TL;DR: We propose a process-level, internal signal for detecting correctness along an LLM's reasoning trajectory. Across six models and four mathematical datasets, the method achieves state-of-the-art performance compared to existing methods.
Abstract: Accurate step-level correctness signals are key to reliable LLM mathematical reasoning. Prior work either invokes external judges or Process Reward Models at inference time, incurring heavy compute cost, or leverages internal representations but yields only outcome-level signals without step-wise granularity. We introduce $\mathbf{ProcessLID}$, a training-free, representation-based method grounded in local intrinsic dimension that produces step-level correctness signals. Across six models and four math benchmarks, ProcessLID attains state-of-the-art step-level and outcome-level performance and remains competitive in inference-time settings. Lastly, we provide analyses that explain why the method is effective.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 23484