LassoBench: A High-Dimensional Hyperparameter Optimization Benchmark Suite for Lasso

25 Feb 2022, 12:35 (modified: 16 Jul 2022, 13:35)
AutoML-Conf 2022 (Main Track)
Readers: Everyone
Abstract: While Weighted Lasso sparse regression has appealing statistical guarantees that could have a major real-world impact in finance, genomics, and brain imaging applications, it is rarely adopted because of its complex high-dimensional search space, which comprises thousands of hyperparameters. On the other hand, recent progress in high-dimensional hyperparameter optimization (HD-HPO) methods for black-box functions demonstrates that high-dimensional applications can indeed be optimized efficiently. Despite this initial success, HD-HPO approaches are mostly applied to synthetic problems with a moderate number of dimensions, which limits their impact in scientific and engineering applications. We propose LassoBench, the first benchmark suite tailored for Weighted Lasso regression. LassoBench consists of benchmarks for both well-controlled synthetic setups (number of samples, noise level, ambient and effective dimensionalities, and multiple fidelities) and real-world datasets, which allows many flavors of HPO algorithms to be studied and extended to the high-dimensional Lasso setting. We evaluate 6 state-of-the-art HPO methods and 3 Lasso baselines, and demonstrate that Bayesian optimization and evolutionary strategies can improve over the methods commonly used for sparse regression, while highlighting limitations of these frameworks in very high-dimensional and noisy settings.
Keywords: Bayesian optimization, Weighted Lasso regression, high-dimensional hyperparameter optimization
One-sentence Summary: LassoBench, based on Weighted Lasso regression, provides a platform on which newly proposed high-dimensional HPO algorithms can be easily tested on different synthetic and real-world problems.
Track: Special track for systems, benchmarks and challenges
Reproducibility Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Reviewers: Kenan Šehić
Main Paper And Supplementary Material: pdf
CPU Hours: 0
GPU Hours: 0
TPU Hours: 0
Class Of Approaches: Bayesian Optimization, Evolutionary Methods, Sparse regression
Datasets And Benchmarks: LIBSVM
Performance Metrics: MSE
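
To make the Weighted Lasso setting from the abstract concrete, here is a minimal sketch of why the search space grows with the number of features: each feature gets its own penalty, and the validation MSE becomes a black-box function of all of them. This is an illustration under stated assumptions, not LassoBench's API. The function names (`soft_threshold`, `weighted_lasso`, `validation_mse`), the log-scale parametrization of the penalties, and the plain cyclic coordinate descent solver are all hypothetical choices made for this example.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of |.|)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, penalties, n_sweeps=200):
    """Cyclic coordinate descent for
        (1 / (2n)) * ||y - X w||^2 + sum_j penalties[j] * |w_j|.
    X is a list of rows, y a list of targets, penalties one value per feature.
    """
    n, d = len(X), len(X[0])
    coef = [0.0] * d
    residual = list(y)  # equals y - X w while w is all zeros
    col_scale = [sum(row[j] ** 2 for row in X) / n for j in range(d)]
    for _ in range(n_sweeps):
        for j in range(d):
            if col_scale[j] == 0.0:
                continue  # an all-zero feature keeps a zero coefficient
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * coef[j]) for i in range(n)
            ) / n
            updated = soft_threshold(rho, penalties[j]) / col_scale[j]
            delta = updated - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                coef[j] = updated
    return coef


def validation_mse(log_penalties, X_train, y_train, X_val, y_val):
    """Black-box objective: one log-penalty per feature in, validation MSE out."""
    # Assumption for this sketch: penalties are searched on a log scale.
    penalties = [math.exp(v) for v in log_penalties]
    coef = weighted_lasso(X_train, y_train, penalties)
    errors = [
        sum(x_ij * w_j for x_ij, w_j in zip(row, coef)) - target
        for row, target in zip(X_val, y_val)
    ]
    return sum(e * e for e in errors) / len(errors)
```

An HPO method would treat `validation_mse` as the function to minimize over `log_penalties`, so a dataset with thousands of features yields a search space with thousands of dimensions.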