Local Function Complexity for Active Learning via Mixture of Gaussian Processes

Danny Panknin; Stefan Chmiela; Klaus Robert Muller; Shinichi Nakajima

Published: 08 Dec 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: Inhomogeneities in real-world data, e.g., due to changes in the observation noise level or variations in the structural complexity of the source function, pose a unique set of challenges for statistical inference. Accounting for them can greatly improve predictive power when physical resources or computation time is limited. In this paper, we draw on recent theoretical results on the estimation of local function complexity (LFC), derived from the domain of local polynomial smoothing (LPS), to establish a notion of local structural complexity, which is used to develop a model-agnostic active learning (AL) framework. Due to its reliance on pointwise estimates, the LPS model class is neither robust nor scalable with respect to the large input space dimensions that typically come with real-world problems. Here, we derive and estimate the Gaussian process regression (GPR)-based analog of the LPS-based LFC and use it as a substitute in the above framework to make it robust and scalable. We assess the effectiveness of our LFC estimate in an AL application on a prototypical low-dimensional synthetic dataset, before taking on the challenging real-world task of reconstructing a quantum chemical force field for a small organic molecule and demonstrating state-of-the-art performance with a significantly reduced training demand.

Submission Length: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=NxnBXFzez3

Changes Since Last Submission:

# Summary of concerns

In the previous submission, the editor summarized the reviewers' concerns as follows:

- The overall paper was hard to read.
- The setup of the paper was lacking clarity.
- The central ideas of the paper require better presentation.
- Appropriate baseline comparisons should be added to the experiments.
- Claims and evidence must be matched.

In addition, the reviewers had individual requests:

- A better explanation of the molecular dynamics simulation experiment.
- A better review of the main active learning foundation.
- More details on model training and hyperparameter selection in the main body of the paper.
- Analytic forms of sparse Gaussian process regression models as well as inducing point selection methods should be compared and discussed.

Finally, the reviewers had individual concerns:

- The synopsis of the paper made the reader anticipate a more central role of heteroscedastic treatment than present.
- Knowledge about smoothness of the target function and intrinsic dimension of the input space is not always given (or not straightforward to estimate) in practical settings.

# Changes since last submission

In our revised manuscript, we addressed all the above issues, which led to plentiful small changes that are too tedious to list. On a high-level perspective, we summarize the major changes as follows:

We reorganized the structure of the paper to make it easier to read and improve clarity. In particular,

- We moved discussion and related work to new, separate sections. Accordingly, we relocated those parts of the discussion and related work that were formerly addressed in the introduction to ease the reading flow.
- We improved the description of the experimental setup (including the molecular dynamics simulation experiment).
- We added a review of the active learning foundation to the preliminaries section.
- We moved details on model training from the appendix to the main body of the paper.
- We added a clear definition of the active learning performance measure, and what it means for a training data selection to be superior.
- We point out that any version of Gaussian processes for the experts can be used. In this regard, we added a discussion on analytic Gaussian process variants. These are now also covered in the provided code (see below).

We revised the scope and applicability of our work to address the mismatch between claims and evidence. In particular,

- We delimited the considered setting to model-agnostic active learning at large training sizes. In this regard, we emphasize the complementary role of our work to the majority of active learning literature that may be preferable in other active learning scenarios.
- We emphasize that heteroscedasticity is not the focus of our work. Yet, we analyze the heteroscedastic treatment of our model and the active learning framework and demonstrate our findings on toy data.
- We discuss intrinsic dimension estimation (referring to related work).
- We added limitations regarding unknown smoothness of the target function.

We added baseline comparisons regarding active learning and inducing point selection. Within the considered active learning setting we built our work on a state-of-the-art method. Hence, state-of-the-art performance within the set of relevant competitors from related work is implied.

Regarding reviewers' concerns about practicability of our work, including the deployed mixture of Gaussian processes model,

- We now provide anonymized Python code of the model and the active learning framework, which will be fully disclosed upon acceptance of the paper.
- We added instructions on the hyperparameter selection of the model.

Code: https://github.com/DPanknin/modelagnostic_superior_training

Assigned Action Editor: ~Lijun_Zhang1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 1485
