Keywords: Interpretability, SGMCMC, Influence Functions, MCMC, Complexity, Loss Landscape, Geometry, Singular Learning Theory
TL;DR: We introduce a Bayesian generalization of influence functions that scales to models with billions of parameters.
Abstract: Classical influence functions face significant challenges when applied to deep neural networks, primarily due to singular Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function, an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic gradient MCMC (SGMCMC). This approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. Initial results on language and vision models indicate performance comparable to state-of-the-art methods such as EK-FAC, often at substantially reduced computational cost.
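The abstract describes replacing Hessian inversion with loss landscape statistics sampled via SGMCMC. A minimal sketch of one plausible instantiation follows, assuming (this is not confirmed by the abstract) that the influence of a training example on a query is estimated as the posterior covariance of their losses under a localized posterior sampled with stochastic gradient Langevin dynamics (SGLD). The toy model, localization term, and all hyperparameters are illustrative placeholders, not the paper's actual setup.

```python
# Sketch: local Bayesian influence via SGLD loss-covariance estimation.
# Assumption: influence(i, q) ~= Cov over posterior samples of the
# per-example losses of training point i and query point q.
import torch

torch.manual_seed(0)

# Toy linear regression stands in for a real network and dataset.
n, d = 64, 8
X = torch.randn(n, d)
w_true = torch.randn(d)
y = X @ w_true + 0.1 * torch.randn(n)

# "Trained" parameters the sampler is localized around.
w0 = torch.linalg.lstsq(X, y.unsqueeze(1)).solution.squeeze(1)

def per_example_loss(w):
    return 0.5 * (X @ w - y) ** 2  # shape (n,)

# SGLD with a Gaussian localization pulling samples toward w0
# (illustrative choice; eps, gamma, beta are placeholder values).
eps, gamma, beta = 1e-4, 10.0, 1.0
w = w0.clone()
loss_draws = []
for step in range(2000):
    idx = torch.randint(0, n, (16,))  # minibatch indices
    w_ = w.detach().requires_grad_(True)
    batch_loss = per_example_loss(w_)[idx].mean()
    grad = torch.autograd.grad(batch_loss, w_)[0]
    drift = beta * n * grad + gamma * (w - w0)  # posterior + localization
    noise = eps ** 0.5 * torch.randn_like(w)
    w = w - 0.5 * eps * drift + noise
    if step > 500 and step % 5 == 0:  # burn-in, then thinning
        loss_draws.append(per_example_loss(w).detach())

L = torch.stack(loss_draws)                 # (num_draws, n) losses
Lc = L - L.mean(dim=0, keepdim=True)
influence = Lc.T @ Lc / (L.shape[0] - 1)    # (n, n) loss covariance
# influence[i, q] estimates how example i affects the loss on query q.
print(influence[0, :5])
```

Note that nothing here requires forming or inverting a Hessian: the estimate only needs per-example losses evaluated at sampler draws, which is what allows this family of methods to scale to large models.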
Student Paper: Yes
Submission Number: 49