MAGIC: Near-Optimal Data Attribution for Deep Learning

ICLR 2026 Conference Submission 16561 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: data attribution, data valuation, influence function
TL;DR: We present a method that nearly perfectly predicts the counterfactual effect of changing the training dataset on a deep learning model.
Abstract: The goal of data attribution is to estimate how adding or removing a given set of training datapoints will affect model predictions. This goal is straightforward in convex settings, but in non-convex settings existing methods are far less successful: their estimates often correlate only weakly with the ground truth. In this work, we present a new data attribution method (MAGIC) that combines classical methods with recent advances in metadifferentiation (Engstrom et al., 2025) to nearly optimally estimate the effect of adding or removing training data on model predictions, at only 2-3x the cost of training a single model. MAGIC essentially "solves" data attribution as it is currently studied, thus enabling downstream applications and motivating more fine-grained future evaluations.
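For readers unfamiliar with the task setup, the sketch below (illustrative only, not from the submission) shows the counterfactual quantity that data attribution methods aim to predict: how a model's output on a fixed test example changes when a subset of the training data is removed, with the prediction quality typically scored by rank correlation against retrained ground truth. The toy logistic-regression model, the `model_output` helper, and the placeholder random attribution scores are all assumptions for illustration; they do not reflect MAGIC's actual estimator.

```python
# Illustrative sketch of the data attribution task (assumptions, not the paper's method).
import numpy as np
from sklearn.linear_model import LogisticRegression
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) + 0.3 * rng.normal(size=200) > 0).astype(int)
x_test = rng.normal(size=(1, 5))

def model_output(train_idx):
    """Train on the given subset of the data and return the model's logit on x_test."""
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    return float(clf.decision_function(x_test)[0])

full = np.arange(len(X))
base = model_output(full)

# Ground-truth counterfactual: retrain after removing each random 10% of the data.
subsets = [rng.choice(len(X), size=180, replace=False) for _ in range(20)]
actual = np.array([model_output(s) - base for s in subsets])

# A data attribution method instead *predicts* these effects from per-example scores,
# without retraining for every subset. Here the scores are a random placeholder.
scores = rng.normal(size=len(X))
predicted = np.array([-scores[np.setdiff1d(full, s)].sum() for s in subsets])

# Attribution quality is commonly measured by how well predicted effects track actual ones.
rho, _ = spearmanr(predicted, actual)
print("Spearman correlation:", rho)
```

With random placeholder scores the correlation is near zero; the abstract's claim is that MAGIC's predicted effects correlate near-perfectly with the retrained ground truth even in non-convex settings.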
Primary Area: interpretability and explainable AI
Submission Number: 16561