GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection

Pingbang Hu; Joseph Melkonian; Weijing Tang; Han Zhao; Jiaqi W. Ma

Published: 18 Sept 2025, Last Modified: 29 Oct 2025. Venue: NeurIPS 2025 poster. License: CC BY 4.0

Keywords: data attribution, influence function, gradient compression

TL;DR: We propose a scalable gradient compression algorithm for data attribution that has sub-linear complexity and achieves competitive attribution results.

Abstract: Gradient-based data attribution methods, such as influence functions, are critical for understanding the impact of individual training samples without requiring repeated model retraining. However, their scalability is often limited by the high computational and memory costs of per-sample gradient computation. In this work, we propose **GraSS**, a novel gradient compression algorithm, and its variant **FactGraSS**, designed specifically for linear layers. Both explicitly leverage the inherent sparsity of per-sample gradients to achieve sub-linear space and time complexity. Extensive experiments demonstrate the effectiveness of our approach, which achieves substantial speedups while preserving data influence fidelity. In particular, **FactGraSS** achieves up to 165% faster throughput on billion-scale models compared to previous state-of-the-art baselines. Our code is publicly available at https://github.com/TRAIS-Lab/GraSS.
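
To make the two ideas in the title concrete, the following is a minimal illustrative sketch of sparsifying a per-sample gradient and then applying a sparse projection. It is not taken from the GraSS repository. The function name `make_sparse_compressor`, the parameters `k_mask` and `k_out`, the random coordinate mask, and the count-sketch style hashing are all assumptions chosen for illustration; the actual method may differ in each of these choices.

```python
import numpy as np

def make_sparse_compressor(p, k_mask, k_out, seed=0):
    """Build a compressor mapping a length-p gradient to k_out dimensions.

    Hypothetical two-stage sketch (an assumption, not the authors' code):
      1. sparsification: keep only k_mask randomly chosen coordinates;
      2. sparse projection: hash each kept coordinate into one of k_out
         buckets with a random sign and sum within each bucket.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(p, size=k_mask, replace=False)        # coordinates kept by the mask
    bucket = rng.integers(0, k_out, size=k_mask)           # target bucket per kept coordinate
    sign = rng.choice(np.array([-1.0, 1.0]), size=k_mask)  # random sign per kept coordinate

    def compress(grad):
        kept = grad[idx] * sign              # reads k_mask entries, not all p
        out = np.zeros(k_out, dtype=float)
        np.add.at(out, bucket, kept)         # unbuffered scatter-add into buckets
        return out

    return compress

# Usage: the same compressor must be shared by every sample so that
# compressed gradients remain comparable to one another.
compress = make_sparse_compressor(p=10_000, k_mask=512, k_out=64)
g = np.random.default_rng(1).standard_normal(10_000)
compressed = compress(g)
```

In this sketch, each call to `compress` touches `k_mask` coordinates plus the `k_out` output entries, rather than performing a dense `k_out` by `p` matrix product. That is the kind of saving a sub-linear compression scheme aims for.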

Supplementary Material: zip

Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)

Submission Number: 22982
