Functional Bilevel Optimization for Machine Learning

Ieva Petrulionytė; Julien Mairal; Michael Arbel

Published: 25 Sept 2024, Last Modified: 06 Nov 2024. Venue: NeurIPS 2024 spotlight. License: CC BY 4.0

Keywords: bilevel optimization, functional optimization, adjoint method, neural networks

TL;DR: A functional perspective on bilevel optimization for deep learning applications.

Abstract: In this paper, we introduce a new functional point of view on bilevel optimization problems for machine learning, where the inner objective is minimized over a function space. These types of problems are most often solved by using methods developed in the parametric setting, where the inner objective is strongly convex with respect to the parameters of the prediction function. The functional point of view does not rely on this assumption and notably allows using over-parameterized neural networks as the inner prediction function. We propose scalable and efficient algorithms for the functional bilevel optimization problem and illustrate the benefits of our approach on instrumental regression and reinforcement learning tasks.

Primary Area: Optimization (convex and non-convex, discrete, stochastic, robust)

Submission Number: 6898
