How Weight Pruning Destroys Chain-of-Thought Reasoning in Language Reasoning Models: A Model Similarity and Faithfulness Correlation Analysis

Published: 16 Oct 2025, Last Modified: 10 Nov 2025 · NeurIPS 2025 ER Workshop · CC BY 4.0
Keywords: Weight-space similarity, Sparsity-Adjusted Normalized Distance, Neural network pruning, Reasoning faithfulness, Chain-of-thought degradation
Abstract: Efficient reasoning under compute and memory constraints is critical for deploying large language models (LLMs) in real-world scenarios. We propose a framework to quantify the relationship between model similarity and faithfulness degradation under pruning, introducing ASAND, a sparsity-adjusted normalized distance metric that combines centered alignment, sparsity-aware structural measures, and adaptive exponential decay to predict non-monotonic changes in reasoning fidelity. Experiments on Qwen-0.5B with GSM8K show that light pruning can improve chain-of-thought (CoT) reasoning, while aggressive sparsity causes catastrophic collapse. Correlation analyses indicate that ASAND outperforms standard similarity metrics, achieving the highest predictive power for faithfulness degradation. These results provide actionable insights for efficient, compression-aware deployment of LLMs, highlighting strategies to maintain reasoning integrity on resource-constrained devices.
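The abstract describes ASAND as combining centered alignment, a sparsity-aware structural term, and adaptive exponential decay, but does not give the formula on this page. The sketch below is a speculative illustration of how such a score could be assembled from a dense and a pruned weight matrix; the specific combination (linear CKA plus pruning-mask overlap, damped exponentially in sparsity) and all parameter names (`alpha`, `decay_rate`) are assumptions, not the authors' definition.

```python
# Illustrative, assumption-based sketch of a sparsity-adjusted weight-similarity score
# in the spirit of ASAND. The exact formulation in the paper may differ.
import numpy as np


def linear_cka(a: np.ndarray, b: np.ndarray) -> float:
    """Linear centered kernel alignment between two same-shaped weight matrices."""
    a = a - a.mean(axis=0, keepdims=True)
    b = b - b.mean(axis=0, keepdims=True)
    num = np.linalg.norm(b.T @ a, ord="fro") ** 2
    den = np.linalg.norm(a.T @ a, ord="fro") * np.linalg.norm(b.T @ b, ord="fro")
    return float(num / (den + 1e-12))


def mask_overlap(dense: np.ndarray, pruned: np.ndarray) -> float:
    """Sparsity-aware structural term (assumed proxy): fraction of surviving weights
    that coincide with the largest-magnitude entries of the dense matrix."""
    surviving = (pruned != 0).ravel()
    k = int(surviving.sum())
    if k == 0:
        return 0.0
    top_k = np.zeros(dense.size, dtype=bool)
    top_k[np.argsort(np.abs(dense).ravel())[-k:]] = True
    return float((top_k & surviving).sum() / k)


def asand_style_distance(dense: np.ndarray, pruned: np.ndarray,
                         alpha: float = 0.5, decay_rate: float = 3.0) -> float:
    """Blend alignment and structural overlap, damp exponentially as sparsity grows,
    and report 1 - similarity as a normalized distance in [0, 1]."""
    sparsity = float((pruned == 0).mean())
    similarity = alpha * linear_cka(dense, pruned) + (1 - alpha) * mask_overlap(dense, pruned)
    return 1.0 - similarity * np.exp(-decay_rate * sparsity)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(256, 256))
    for level in (0.1, 0.5, 0.9):  # light vs. aggressive magnitude pruning
        w_pruned = np.where(np.abs(w) > np.quantile(np.abs(w), level), w, 0.0)
        print(f"sparsity={level:.1f}  distance={asand_style_distance(w, w_pruned):.3f}")
```

Under these assumptions, the exponential term makes the distance grow sharply once sparsity passes a threshold, which is one plausible way to capture the non-monotonic behavior the abstract reports (mild pruning leaves the score near zero, aggressive pruning drives it toward one).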
Submission Number: 182