TL;DR: We consider the problem of regularly retuning an ML system in production and show that state-of-the-art transfer learning for HPO can be easily outperformed by taking the order of the tuning tasks into account.
Abstract: We introduce ordered transfer hyperparameter optimisation (OTHPO), a version of transfer learning for hyperparameter optimisation (HPO) where the tasks follow a sequential order. Unlike in state-of-the-art transfer HPO, the assumption is that each task is most correlated with those immediately before it. This matches many deployed settings, where hyperparameters are retuned as more data is collected; for instance, tuning a sequence of movie recommendation systems as more movies and ratings are added. We give a formal definition, outline the differences from related problems, and propose a basic OTHPO method that outperforms state-of-the-art transfer HPO. We empirically show the importance of taking order into account using ten benchmarks. The benchmarks are in the setting of gradually accumulating data, and span XGBoost, random forest, approximate k-nearest neighbor, elastic net, support vector machines and a separate real-world motivated optimisation problem. We open-source the benchmarks to foster future research on ordered transfer HPO.
Keywords: Hyperparameter Optimisation, Transfer Learning, Bayesian Optimisation, Continual Learning, HPO Deployment
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Code And Dataset Supplement: zip
Steps For Environmental Footprint Reduction During Development: We used tabular and surrogate benchmarks to evaluate each method over multiple seeds, which significantly reduced our compute footprint.
CPU Hours: 311
GPU Hours: 0
TPU Hours: 0
Evaluation Metrics: No