Keywords: Automatic Differentiation, Numerical Linear Algebra, Constrained Optimization, Implicit Differentiation, Gaussian Process
TL;DR: This paper presents a high-performance, differentiable least-squares solver that can be used like a neural network layer and demonstrates its usefulness by enforcing arbitrary constraints in neural networks and calibrating Gaussian processes.
Abstract: This paper argues that the method of least squares has significant unfulfilled potential in modern machine learning, far beyond merely being a tool for fitting linear models. To unlock this potential, we derive custom gradients that turn the solver into a differentiable operator, usable like a neural network layer, which enables many diverse applications. Empirically, we demonstrate: (i) scalability, by enforcing weight sparsity on a 50-million-parameter model; (ii) imposing conservativeness constraints in score-based generative models; and (iii) hyperparameter tuning of Gaussian processes based on predictive performance. In doing so, our work represents the next step in developing differentiable linear-algebra tools and making them widely accessible to machine learning practitioners.
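For intuition, below is a minimal, self-contained JAX sketch (not the paper's implementation) of how a least-squares solve can be exposed as a differentiable operator: the backward pass is obtained by implicitly differentiating the normal equations rather than by unrolling the solver. The function name `lstsq_layer` and the dense normal-equations solve are illustrative assumptions, and the sketch assumes a well-conditioned, overdetermined system.

```python
# Minimal sketch (illustrative only): a least-squares solve as a differentiable
# operator in JAX, with gradients from implicit differentiation of the
# normal equations A^T A x = A^T b rather than from unrolling the solver.
import jax
import jax.numpy as jnp


@jax.custom_vjp
def lstsq_layer(A, b):
    # Forward pass: any least-squares solver could be substituted here.
    return jnp.linalg.solve(A.T @ A, A.T @ b)


def _lstsq_fwd(A, b):
    x = jnp.linalg.solve(A.T @ A, A.T @ b)
    return x, (A, b, x)


def _lstsq_bwd(residuals, g):
    A, b, x = residuals
    # Implicit function theorem applied to A^T (A x - b) = 0: one adjoint
    # solve yields the vector-Jacobian products w.r.t. both A and b.
    u = jnp.linalg.solve(A.T @ A, g)    # (A^T A)^{-1} g
    r = A @ x - b                       # least-squares residual
    grad_b = A @ u
    grad_A = -jnp.outer(A @ u, x) - jnp.outer(r, u)
    return grad_A, grad_b


lstsq_layer.defvjp(_lstsq_fwd, _lstsq_bwd)

# Usage: gradients of any downstream loss flow through the solve.
A = jnp.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b = jnp.array([0.1, 0.9, 2.1])
loss = lambda A, b: jnp.sum(lstsq_layer(A, b) ** 2)
grad_A, grad_b = jax.grad(loss, argnums=(0, 1))(A, b)
```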
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 16273