Sparser, Better, Faster, Stronger: Sparsity Detection for Efficient Automatic Differentiation

TMLR Paper4046 Authors

24 Jan 2025 (modified: 24 Apr 2025) · Decision pending for TMLR · CC BY 4.0
Abstract: From implicit differentiation to probabilistic modeling, Jacobian and Hessian matrices have many potential use cases in Machine Learning (ML), but computing them is often viewed as prohibitively expensive. Fortunately, these matrices frequently exhibit sparsity, which can be leveraged to speed up Automatic Differentiation (AD). This paper presents advances in sparsity detection, previously the performance bottleneck of Automatic Sparse Differentiation (ASD). Our implementation of sparsity detection is based on operator overloading, detects both local and global sparsity patterns, and supports flexible index set representations. It is fully automatic and requires no modification of user code, making it compatible with existing ML codebases. Most importantly, it is highly performant, unlocking Jacobians and Hessians at scales where they were previously considered too expensive to compute. On real-world problems from scientific ML, graph neural networks, and optimization, we demonstrate speed-ups of up to three orders of magnitude. Notably, with our sparsity detection system, ASD outperforms standard AD even for one-off computations, without amortization of either sparsity detection or matrix coloring.
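To give a rough intuition for the operator-overloading approach described in the abstract, the following minimal Python sketch propagates index sets through arithmetic operations to recover a global Jacobian sparsity pattern. The Tracer class and jacobian_sparsity helper are hypothetical names for illustration only; this is not the paper's implementation, which handles a far richer set of primitives and also detects local patterns.

class Tracer:
    """Carries the set of input indices a value may depend on."""

    def __init__(self, indices):
        self.indices = frozenset(indices)

    def _union(self, other):
        # Constants contribute no dependencies; tracers contribute theirs.
        other_indices = other.indices if isinstance(other, Tracer) else frozenset()
        return Tracer(self.indices | other_indices)

    # Every primitive operation propagates the union of input index sets.
    __add__ = __radd__ = __sub__ = __rsub__ = _union
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = _union


def jacobian_sparsity(f, n):
    """Trace f on n index-set tracers; return a boolean sparsity pattern."""
    outputs = f([Tracer({i}) for i in range(n)])
    return [[j in out.indices for j in range(n)] for out in outputs]


# Example: a function whose 3x4 Jacobian is sparse.
def f(x):
    return [x[0] * x[1], x[1] + x[2], 2.0 * x[3]]

print(jacobian_sparsity(f, 4))
# [[True, True, False, False],
#  [False, True, True, False],
#  [False, False, False, True]]

Seeding input i with the singleton set {i} and taking unions at every primitive yields a conservative superset of the true nonzero pattern, which is sufficient input for the matrix-coloring stage of an ASD pipeline.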
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We provide a latexdiff document highlighting the changes in our revision: https://anonymous.4open.science/r/sparse-differentiation-paper/tracer_paper-diff61c252.pdf
Assigned Action Editor: ~Pierre_Ablin2
Submission Number: 4046