Revealing Procedural Reasoning Structures in Chain-of-Thought Training via Span-Level Gradient Organization

ACL ARR 2026 January Submission4820 Authors

05 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, License: CC BY 4.0

Keywords: Explainability, Training Dynamics, Chain-of-Thought, Gradient-Based Analysis, Procedural Reasoning, Span-Level Gradients, Model Interpretability

Abstract: Chain-of-Thought (CoT) prompting enables large language models to produce multi-step reasoning, yet how reasoning-related structure is expressed during training remains poorly understood. We present Gradient-based Structural Developer (GSD), an unsupervised framework built on a principled view of gradient aggregation. GSD tracks span-level gradients during fine-tuning on reasoning benchmarks to understand how models develop structured, step-by-step reasoning capabilities. Our analysis shows that, while gradients at the level of individual tokens are often noisy, aggregating gradients over contiguous reasoning-related spans reveals stable and recurring directional alignment across samples. We refer to these directionally aligned patterns as aligned sequential stresses, reflecting consistent gradient organization associated with similar reasoning procedures. Beyond capturing semantically similar reasoning instances, such gradient alignment also reveals structurally similar but semantically diverse cases that share a common procedural organization. These findings position GSD as an explainability framework that makes the internal formation of procedural reasoning during training transparent, enabling human-understandable analysis and diagnosis of how reasoning-oriented behaviors emerge in language models.
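The contrast between noisy token-level gradients and aligned span-level gradients can be illustrated with a small synthetic sketch. This is not the GSD implementation: the abstract does not specify the aggregation rule, so mean pooling over a span is assumed here, and the helper names (`span_gradient`, `cosine`, `make_span`), the vector dimension, and the noise level are all hypothetical choices made for illustration. Each synthetic token gradient is a shared direction plus independent Gaussian noise.

```python
import math
import random


def span_gradient(token_grads):
    """Average per-token gradient vectors over one contiguous span.

    Mean pooling is an assumption; the abstract does not state the rule GSD uses.
    """
    n = len(token_grads)
    dim = len(token_grads[0])
    return [sum(g[i] for g in token_grads) / n for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def make_span(rng, direction, n_tokens, noise):
    """Synthetic span: every token gradient is the shared direction plus noise."""
    return [[d + rng.gauss(0.0, noise) for d in direction] for _ in range(n_tokens)]


rng = random.Random(0)
dim, n_tokens, noise = 64, 32, 3.0  # illustrative values, not from the paper
direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Two spans from different "samples" that share the same underlying direction.
span_a = make_span(rng, direction, n_tokens, noise)
span_b = make_span(rng, direction, n_tokens, noise)

# Token-level alignment: mean cosine over paired individual token gradients.
token_level = sum(cosine(a, b) for a, b in zip(span_a, span_b)) / n_tokens

# Span-level alignment: cosine between the two aggregated span gradients.
span_level = cosine(span_gradient(span_a), span_gradient(span_b))

print(f"token-level alignment: {token_level:.3f}")
print(f"span-level alignment:  {span_level:.3f}")
```

Under these assumptions, averaging over a span shrinks the independent noise while preserving the shared component, so the span-level cosine comes out well above the token-level one. With real model gradients, the per-token vectors would come from backpropagation rather than from a synthetic generator.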

Paper Type: Long

Research Area: Special Theme (conference specific)

Research Area Keywords: Interpretability and Analysis of Models for NLP, Machine Learning for NLP, Language Modeling

Contribution Types: Model analysis & interpretability

Languages Studied: English

Submission Number: 4820
