Z0-Inf: Zeroth Order Approximation for Data Influence

17 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Data Influence, Data Attribution
TL;DR: We present a zeroth-order approximation algorithm for data attribution
Abstract: Influence functions are a popular tool for measuring the impact of a training point on a model's prediction; in particular, self-influence, which quantifies the influence of a training point on itself, has found many uses such as data selection and outlier detection. However, influence functions have seen limited use in modern models such as LLMs: most existing algorithms are either highly inaccurate or require computing gradients or approximate inverse Hessians, which can be prohibitive for large models. In this work, we introduce a highly efficient zeroth-order approximation to data influence that requires only a fraction of the time and memory footprint of previous methods and applies to both differentiable and non-differentiable loss functions. Beyond its computational advantage, our method delivers superior accuracy in estimating self-influence and comparable or better accuracy in estimating train-test influence for fine-tuned large language models, paving the way for broader and more practical application of influence-function techniques in state-of-the-art AI systems.
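
To make the core idea concrete, below is a minimal sketch of one generic zeroth-order self-influence estimate; the abstract does not specify Z0-Inf's actual procedure, so this is an illustrative assumption, not the paper's algorithm. It estimates the squared gradient norm ‖∇L(z)‖² (self-influence under an identity-Hessian approximation) using only forward passes: for a random direction v ~ N(0, I), the central difference (L(θ+εv) − L(θ−εv)) / (2ε) approximates ∇L(z)ᵀv, and E[(∇L(z)ᵀv)²] = ‖∇L(z)‖². The function name and hyperparameters are hypothetical.

```python
import torch

def zeroth_order_self_influence(model, loss_fn, x, y, eps=1e-3, n_dirs=32):
    """Estimate self-influence of example (x, y) from forward passes only.

    Sketch of a generic SPSA-style scheme (identity-Hessian approximation),
    not necessarily the Z0-Inf algorithm described in the paper.
    """
    params = list(model.parameters())
    est = 0.0
    for _ in range(n_dirs):
        # Sample a Gaussian direction over all parameters.
        vs = [torch.randn_like(p) for p in params]
        with torch.no_grad():
            for p, v in zip(params, vs):
                p.add_(eps * v)                  # theta + eps * v
            loss_plus = loss_fn(model(x), y).item()
            for p, v in zip(params, vs):
                p.sub_(2 * eps * v)              # theta - eps * v
            loss_minus = loss_fn(model(x), y).item()
            for p, v in zip(params, vs):
                p.add_(eps * v)                  # restore theta
        directional = (loss_plus - loss_minus) / (2 * eps)  # ~ grad(L)^T v
        est += directional ** 2                  # E[(grad^T v)^2] = ||grad||^2
    return est / n_dirs

# Toy usage on a small regression model.
model = torch.nn.Linear(8, 1)
x, y = torch.randn(1, 8), torch.randn(1, 1)
score = zeroth_order_self_influence(model, torch.nn.functional.mse_loss, x, y)
print(f"estimated self-influence: {score:.4f}")
```

Note that this kind of estimator needs no backward pass and no stored gradients, which is the source of the memory savings the abstract claims; because it only evaluates the loss, it also applies to non-differentiable losses (at the cost of Monte Carlo variance that shrinks with n_dirs).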
Primary Area: interpretability and explainable AI
Submission Number: 9968