Confirmation: our paper adheres to reproducibility best practices. In particular, we confirm that all important details required to reproduce the results are described in the paper; the authors agree to the paper being made available online through OpenReview under a CC-BY 4.0 license (https://creativecommons.org/licenses/by/4.0/); and the authors have read and commit to adhering to the AutoML 2025 Code of Conduct (https://2025.automl.cc/code-of-conduct/).
Reproducibility: zip
TL;DR: Reformulating AutoML meta-learning targets from performance scores to rank positions allows simple regression models to more efficiently select pipelines—boosting ranking metrics and enabling easy integration with classic methods.
Abstract: Traditional approaches to pipeline selection in automated machine learning (AutoML) typically rely on predicting the absolute or relative performance scores of candidate pipelines for a given task, based on data acquired from previous tasks—i.e., meta-learning. This process can be complex due to the need for task-specific regression models and performance metrics. In contrast, rank-based methods estimate the relative ordering of pipelines, which aligns more directly with the decision-making nature of the selection task. Although ranking-based approaches have been explored previously, prior work often relies on computationally expensive pairwise comparisons or complex listwise formulations. In this study, we adopt a simpler alternative: reformulating the prediction target from absolute scores to rank positions—without modifying model architectures. This “ranking trick” enables the use of regression models while leveraging positional information. It is general and compatible with a wide range of existing AutoML techniques. Additionally, through controlled experiments, we show that these rank-based regression models are significantly less sensitive to noisy or overfitted meta-learning data, a common issue in practical AutoML settings. As a result, this approach enables more robust, metric-agnostic solutions and facilitates evaluation through ranking metrics such as NDCG and MRR. We evaluate this formulation across three large-scale OpenML benchmarks, demonstrating consistent advantages for ranking-based regression models. Furthermore, we explore its integration with Bayesian optimization and Monte Carlo Tree Search, yielding improved ranking quality. Finally, we identify a strong relationship between ranking-based metrics and key AutoML objectives such as final performance score and time-to-solution, providing guidance for AutoML practitioners.
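The sketch below illustrates the general idea of the "ranking trick" described in the abstract: per-task performance scores are converted to rank positions, and an off-the-shelf regressor is trained on those ranks instead of raw scores. It is a minimal, hedged example with synthetic data; the meta-features, the RandomForestRegressor, and all variable names are illustrative assumptions, not the paper's actual implementation or benchmarks.

```python
# Illustrative sketch of the "ranking trick" (hypothetical data and model,
# not the authors' code): regress on per-task rank positions instead of scores.
import numpy as np
from scipy.stats import rankdata
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import ndcg_score

rng = np.random.default_rng(0)
n_tasks, n_pipelines, n_meta_features = 50, 20, 8

# Hypothetical meta-dataset: task meta-features, pipeline identities, and
# noisy performance scores for each (task, pipeline) pair.
task_features = rng.normal(size=(n_tasks, n_meta_features))
pipeline_ids = np.eye(n_pipelines)
scores = rng.uniform(size=(n_tasks, n_pipelines))  # stand-in for accuracy/AUC

# Ranking trick: within each task, map scores to rank positions (1 = best).
ranks = np.apply_along_axis(
    lambda s: rankdata(-s, method="average"), axis=1, arr=scores
)

# Flatten to one row per (task, pipeline) pair for a standard regressor.
X = np.hstack([
    np.repeat(task_features, n_pipelines, axis=0),
    np.tile(pipeline_ids, (n_tasks, 1)),
])
y_rank = ranks.ravel()

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y_rank)

# Selection on a new task: predict ranks and pick the pipeline with the
# lowest (best) predicted rank position.
new_task = rng.normal(size=(1, n_meta_features))
X_new = np.hstack([np.repeat(new_task, n_pipelines, axis=0), pipeline_ids])
predicted_ranks = model.predict(X_new)
print("selected pipeline:", int(np.argmin(predicted_ranks)))

# Ranking-metric evaluation (illustrative): NDCG over the training tasks,
# using inverted ranks as relevance and negated predictions as scores.
pred_train = model.predict(X).reshape(n_tasks, n_pipelines)
print("NDCG:", ndcg_score(n_pipelines - ranks, -pred_train))
```

Because only the regression target changes, the same pattern can in principle wrap any existing score-prediction model, which is what makes the reformulation compatible with classic AutoML components such as Bayesian optimization surrogates.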
Submission Number: 30