Abstract: Gaussian process regression (GPR), or kernel ridge regression, is a widely used and powerful tool for nonlinear prediction. Active learning (AL) for GPR, which actively collects data labels to achieve accurate prediction with fewer labels, is therefore an important problem. However, existing AL methods do not theoretically guarantee prediction accuracy for the target distribution. Furthermore, as discussed in the distributionally robust learning literature, specifying the target distribution is often difficult. This paper thus proposes two AL methods that effectively reduce the worst-case expected error for GPR, i.e., the worst-case expectation over a set of candidate target distributions. We derive an upper bound on the worst-case expected squared error, which shows that the error becomes arbitrarily small with a finite number of data labels under mild conditions. Finally, we demonstrate the effectiveness of the proposed methods on synthetic and real-world datasets.
Lay Summary: Collecting data often requires significant time and financial resources. In such cases, it is important to collect data so that the machine learning model achieves good predictive performance with as little data as possible. Active learning is a methodology for collecting data for this purpose. If the prediction target is known, it is effective to collect data that reduces the prediction error for that target. In general, however, it is unclear how the prediction target will be specified; for example, multiple users may apply a predictive model to different targets.
In this study, we propose active learning methods that aim to improve prediction performance for every candidate in a given set of target probability distributions, each of which may generate the prediction target. Furthermore, we establish the theoretical validity of the proposed methods for a widely used machine learning model, Gaussian process regression. Our proposed methods thus enable the construction of a general-purpose and accurate machine learning model for various target probability distributions with reduced data collection costs.
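The worst-case objective described above can be illustrated with a small sketch. This is not the paper's actual algorithm; it is a minimal greedy heuristic, assuming an RBF kernel and a finite pool of unlabeled inputs, where each candidate target distribution is represented as a weight vector over test points and the next point is chosen to minimize the worst-case expected GP posterior variance. All function names (e.g. `worst_case_greedy`) are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.3):
    # Squared-exponential kernel matrix between 1-D point sets A and B.
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def posterior_variance(X_train, X_test, noise=1e-4, lengthscale=0.3):
    # GP posterior variance at X_test; depends only on inputs, not labels.
    K = rbf_kernel(X_train, X_train, lengthscale) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train, lengthscale)
    Kss_diag = np.ones(len(X_test))  # RBF prior variance is 1 everywhere
    return Kss_diag - np.einsum("ij,jk,ik->i", Ks, np.linalg.inv(K), Ks)

def worst_case_greedy(pool, candidate_weights, n_select, X_test):
    """Greedily pick inputs from `pool` that minimize the worst-case
    (over candidate distributions, rows of `candidate_weights` summing
    to 1 over X_test) expected posterior variance."""
    chosen = []
    for _ in range(n_select):
        best_i, best_val = None, np.inf
        for i in range(len(pool)):
            if i in chosen:
                continue
            var = posterior_variance(pool[chosen + [i]], X_test)
            worst = max(w @ var for w in candidate_weights)
            if worst < best_val:
                best_i, best_val = i, worst
        chosen.append(best_i)
    return pool[chosen], best_val
```

For instance, with two candidate distributions supported on the left and right halves of the input domain, the greedy worst-case criterion selects training inputs that cover both halves, whereas optimizing for either single distribution alone would concentrate the selected points on one side.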
Primary Area: General Machine Learning->Online Learning, Active Learning and Bandits
Keywords: Gaussian process, active learning, distributionally robust, experimental design
Submission Number: 3501