Abstract: The bias-variance trade-off is a well-known problem in machine learning that becomes more pronounced the less data is available. When data is scarce, such as in metamodeling, active learning, and Bayesian optimization, neglecting this trade-off can cause inefficient and suboptimal querying, leading to unnecessary data labeling. In this paper, we focus on metamodeling with active learning and the canonical Gaussian Process (GP). We recognize that, for the GP, the bias-variance trade-off is regulated by optimizing two hyperparameters: the length scale and the noise term. Since the optimal mode of the joint posterior of these hyperparameters corresponds to the optimal bias-variance trade-off, we approximate this joint posterior and use it to design two new acquisition functions. The first is a mode-seeking Bayesian variant of Query-by-Committee (B-QBC); the second simultaneously seeks the mode and minimizes the predictive variance through a Query by Mixture of Gaussian Processes (QB-MGP) formulation. Across seven simulators, we empirically show that B-QBC outperforms the benchmark acquisition functions, whereas QB-MGP is the most robust acquisition function and achieves the best accuracy with the fewest iterations. Overall, we show that incorporating the bias-variance trade-off into acquisition functions mitigates unnecessary and expensive data labeling.
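To make the abstract's mechanism concrete, below is a minimal sketch (not the authors' released code) of a B-QBC-style acquisition in scikit-learn. The RBF length scale and WhiteKernel noise level play the role of the two hyperparameters whose marginal-likelihood optimization regulates the bias-variance trade-off; the Gaussian perturbation of the fitted log-hyperparameters is an illustrative stand-in for sampling the approximate joint hyperparameter posterior, and all names here are assumptions, not the paper's implementation.

```python
# Hedged sketch of a Bayesian Query-by-Committee (B-QBC)-style acquisition:
# a committee of GPs with hyperparameters drawn around the posterior mode,
# scoring candidates by disagreement between the committee's posterior means.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Toy labeled set and candidate pool from a 1-D "simulator".
X_train = rng.uniform(0, 10, size=(8, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(8)
X_pool = np.linspace(0, 10, 200).reshape(-1, 1)

# Fit a GP; marginal-likelihood optimization of the RBF length scale and the
# WhiteKernel noise term is what regulates the bias-variance trade-off.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5)
gp.fit(X_train, y_train)

# Committee: perturb the fitted log-hyperparameters as a crude stand-in for
# sampling from the approximate joint hyperparameter posterior (an assumption
# of this sketch, not the paper's posterior approximation).
committee_means = []
for _ in range(10):
    theta = gp.kernel_.theta + 0.3 * rng.standard_normal(gp.kernel_.theta.shape)
    member = GaussianProcessRegressor(
        kernel=gp.kernel_.clone_with_theta(theta), optimizer=None
    )
    member.fit(X_train, y_train)
    committee_means.append(member.predict(X_pool))

# QBC-style score: disagreement (variance) between committee posterior means;
# the next query is the candidate the committee disagrees on most.
disagreement = np.var(committee_means, axis=0)
x_next = X_pool[np.argmax(disagreement)]
print("next query point:", x_next)
```

Maximizing disagreement between posterior means under different plausible hyperparameters is the mode-seeking behavior the abstract attributes to B-QBC; the QB-MGP variant would additionally weigh in each member's predictive variance, which this sketch omits.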
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2205.10186/code)