CS4ML: A general framework for active learning with arbitrary data based on Christoffel functions

Published: 21 Sept 2023, Last Modified: 02 Nov 2023NeurIPS 2023 spotlightEveryoneRevisionsBibTeX
Keywords: active learning, regression, arbitrary data, leverage scores, Christoffel functions, generative models, Magnetic Resonance Imaging (MRI), Physics-Informed Neural Networks (PINNs)
TL;DR: We introduce a general framework for active learning in regression problems that allows for general types of data and optimal sampling via so-called generalized Christoffel functions.
Abstract: We introduce a general framework for active learning in regression problems. Our framework extends the standard setup by allowing for general types of data, rather than merely pointwise samples of the target function. This generalization covers many cases of practical interest, such as data acquired in transform domains (e.g., Fourier data), vector-valued data (e.g., gradient-augmented data), data acquired along continuous curves, and, multimodal data (i.e., combinations of different types of measurements). Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces (model classes). We introduce the concept of \textit{generalized Christoffel functions} and show how these can be used to optimize the sampling measures. We prove that this leads to near-optimal sample complexity in various important cases. This paper focuses on applications in scientific computing, where active learning is often desirable, since it is usually expensive to generate data. We demonstrate the efficacy of our framework for gradient-augmented learning with polynomials, Magnetic Resonance Imaging (MRI) using generative models and adaptive sampling for solving PDEs using Physics-Informed Neural Networks (PINNs).
Supplementary Material: zip
Submission Number: 8293
Loading