Pruning via Causal Attribution Preserves Reasoning in Large Language Models

Published: 05 Mar 2026, Last Modified: 25 Apr 2026 · ICLR 2026 Workshop LLM Reasoning · CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: large language models, training-free pruning, causal attribution, unstructured sparsity, chain-of-thought, reasoning, attention heads, model compression
TL;DR: We propose Causal Attribution Pruning (CAP), a training-free method that uses causal head-masking on a small reasoning calibration set to guide weight sparsification and better preserve LLM reasoning than magnitude-based pruning at 10-20% sparsity.
Abstract: Large language models (LLMs) excel at multi-step reasoning but incur substantial inference cost. We introduce Causal Attribution Pruning (CAP), a training-free method that identifies critical attention heads by measuring their causal impact on reasoning tasks and uses these head-level scores to guide fine-grained weight pruning. For each attention head, CAP estimates the expected performance degradation when the head is masked during forward passes on a small calibration set of reasoning problems. These causal scores are then converted into weight-level importance values for the corresponding projection matrices. Unlike magnitude-only or activation-based criteria, CAP’s interventional measurement directly captures each head’s functional contribution, yielding relative accuracy gains of up to 61% over Wanda on ARC-Challenge at 20% sparsity. We evaluate CAP on GSM8K, StrategyQA, and ARC-Challenge using Llama-3-8B-Instruct and Mistral-7B-Instruct at 10%, 20%, and 50% sparsity. At moderate sparsity (10–20%), CAP improves over Wanda in most model–benchmark configurations, with especially large gains on ARC-Challenge for Llama-3. Our results suggest that attention-head-level causal attribution can better preserve reasoning performance on downstream benchmarks than correlational pruning criteria at equivalent sparsity, while remaining limited by coarse MLP attribution at 50% sparsity.
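To make the two-stage procedure in the abstract concrete, below is a minimal sketch of the causal head-masking and score-to-weight pruning pipeline. It is an illustrative instantiation only, not the authors' code: the `evaluate` callable, the per-head row-block slicing of the projection matrix, and the magnitude-times-score importance mapping are all assumptions.

```python
import torch

def causal_head_scores(model, calib_set, n_layers, n_heads, evaluate):
    """Score each attention head by the accuracy drop it causes when masked.

    `evaluate(model, calib_set, head_mask)` is an assumed user-supplied function
    returning accuracy on the calibration reasoning problems; `head_mask` is an
    (n_layers, n_heads) tensor of 1s/0s applied during the forward pass.
    """
    base = evaluate(model, calib_set, torch.ones(n_layers, n_heads))
    scores = torch.zeros(n_layers, n_heads)
    for layer in range(n_layers):
        for head in range(n_heads):
            mask = torch.ones(n_layers, n_heads)
            mask[layer, head] = 0.0                      # ablate exactly one head
            scores[layer, head] = base - evaluate(model, calib_set, mask)
    return scores                                        # larger drop => more causally important


def head_to_weight_importance(weight, layer_scores, head_dim):
    """Convert per-head causal scores into per-weight importance for one projection matrix.

    One plausible mapping (an assumption, not necessarily the paper's exact rule):
    scale the weight magnitudes in each head's row block by that head's causal score.
    """
    importance = weight.abs().clone()
    for head, score in enumerate(layer_scores):
        rows = slice(head * head_dim, (head + 1) * head_dim)
        importance[rows, :] *= 1e-3 + max(float(score), 0.0)
    return importance


def prune_by_importance(weight, importance, sparsity):
    """Zero out the lowest-importance fraction of weights (unstructured sparsity)."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    threshold = importance.flatten().kthvalue(k).values
    return weight * (importance > threshold)
```

In this sketch, `causal_head_scores` would be run once on the small reasoning calibration set; each layer's scores then feed `head_to_weight_importance` for that layer's attention projection matrices before `prune_by_importance` is applied at the target sparsity (e.g. 10% or 20%).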
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 44