Keywords: Interpretability, SGMCMC, Influence Functions, MCMC, Complexity, Loss Landscape, Geometry, Singular Learning Theory
TL;DR: We introduce a Bayesian generalization of influence functions that scales to models with billions of parameters.
Abstract: Classical influence functions face significant challenges when applied to deep neural networks, primarily due to singular Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function, an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic gradient MCMC (SGMCMC). This approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. Initial results on language and vision models indicate performance comparable to state-of-the-art methods such as EK-FAC, often at substantially reduced computational cost.
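The abstract describes replacing Hessian inversion with loss landscape statistics sampled via SGMCMC. A minimal sketch of one plausible instantiation follows, assuming (this is not confirmed by the abstract) that the influence of a training example on a query is estimated as the posterior covariance of their losses under a localized posterior sampled with stochastic gradient Langevin dynamics (SGLD). The toy model, localization term, and all hyperparameters are illustrative placeholders, not the paper's actual setup.

```python
# Sketch: local Bayesian influence via SGLD loss-covariance estimation.
# Assumption: influence(i, q) ~= Cov over posterior samples of the
# per-example losses of training point i and query point q.
import torch

torch.manual_seed(0)

# Toy linear regression stands in for a real network and dataset.
n, d = 64, 8
X = torch.randn(n, d)
w_true = torch.randn(d)
y = X @ w_true + 0.1 * torch.randn(n)

# "Trained" parameters the sampler is localized around.
w0 = torch.linalg.lstsq(X, y.unsqueeze(1)).solution.squeeze(1)

def per_example_loss(w):
    return 0.5 * (X @ w - y) ** 2  # shape (n,)

# SGLD with a Gaussian localization pulling samples toward w0
# (illustrative choice; eps, gamma, beta are placeholder values).
eps, gamma, beta = 1e-4, 10.0, 1.0
w = w0.clone()
loss_draws = []
for step in range(2000):
    idx = torch.randint(0, n, (16,))  # minibatch indices
    w_ = w.detach().requires_grad_(True)
    batch_loss = per_example_loss(w_)[idx].mean()
    grad = torch.autograd.grad(batch_loss, w_)[0]
    drift = beta * n * grad + gamma * (w - w0)  # posterior + localization
    noise = eps ** 0.5 * torch.randn_like(w)
    w = w - 0.5 * eps * drift + noise
    if step > 500 and step % 5 == 0:  # burn-in, then thinning
        loss_draws.append(per_example_loss(w).detach())

L = torch.stack(loss_draws)                 # (num_draws, n) losses
Lc = L - L.mean(dim=0, keepdim=True)
influence = Lc.T @ Lc / (L.shape[0] - 1)    # (n, n) loss covariance
# influence[i, q] estimates how example i affects the loss on query q.
print(influence[0, :5])
```

Note that nothing here requires forming or inverting a Hessian: the estimate only needs per-example losses evaluated at sampler draws, which is what allows this family of methods to scale to large models.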
Student Paper: Yes
Submission Number: 49