LMEMs for post-hoc analysis of HPO Benchmarking

Published: 12 Jul 2024, Last Modified: 14 Aug 2024, AutoML 2024 Workshop, License: CC BY 4.0
Keywords: Hyperparameter Optimization, Benchmarking, Significance testing
TL;DR: Applying model-based significance testing for flexible post-hoc analysis of HPO Benchmarking data.
Abstract: The importance of tuning hyperparameters in Machine Learning (ML) and Deep Learning (DL) is well established through empirical research and applications, evident from the steady stream of new hyperparameter optimization (HPO) algorithms and benchmarks contributed by the community. However, current benchmarking practices that average performance across many datasets may obscure key differences between HPO methods, especially for pairwise comparisons. In this work, we apply significance testing based on Linear Mixed-Effects Models (LMEMs) for post-hoc analysis of HPO benchmarking runs. LMEMs allow flexible and expressive modeling of the entire experiment data, including information such as benchmark meta-features, offering deeper insights than current analysis practices. We demonstrate this through a case study on the PriorBand paper's experiment data, uncovering insights not reported in the original work.
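To illustrate the kind of analysis the abstract describes, below is a minimal sketch of fitting an LMEM to HPO benchmarking results with statsmodels' MixedLM. The column names ("optimizer", "benchmark", "seed", "performance"), the simulated numbers, and the optimizer labels are hypothetical placeholders, not the paper's data or code; the point is only that optimizer is modeled as a fixed effect while a random intercept per benchmark captures benchmark-to-benchmark variation instead of averaging it away.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a long-format results table: one row per (optimizer, benchmark, seed) run.
# These values are illustrative only, not taken from the PriorBand experiments.
rng = np.random.default_rng(0)
optimizers = ["PriorBand", "HyperBand", "RandomSearch"]
benchmarks = [f"bench_{i}" for i in range(8)]
rows = []
for bench in benchmarks:
    bench_offset = rng.normal(0.0, 0.05)  # benchmark-specific difficulty
    for opt in optimizers:
        for seed in range(5):
            perf = (0.75 + bench_offset
                    + (0.03 if opt == "PriorBand" else 0.0)  # assumed optimizer effect
                    + rng.normal(0.0, 0.02))                 # run-to-run noise
            rows.append({"optimizer": opt, "benchmark": bench,
                         "seed": seed, "performance": perf})
df = pd.DataFrame(rows)

# Fixed effect: optimizer; random intercept per benchmark models
# benchmark-level variation rather than collapsing it by averaging.
model = smf.mixedlm("performance ~ optimizer", data=df, groups=df["benchmark"])
result = model.fit()
print(result.summary())  # coefficient table with Wald tests for optimizer effects
```

In the same spirit, benchmark meta-features could enter the formula as additional fixed effects or interactions (e.g. "performance ~ optimizer * meta_feature"), which is the kind of flexibility the abstract attributes to LMEM-based analysis.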
Submission Checklist: Yes
Broader Impact Statement: Yes
Paper Availability And License: Yes
Code Of Conduct: Yes
Submission Number: 21