Keywords: cross-validation, statistical hypothesis testing, software
TL;DR: ASTRA combines model training using cross-validation with statistical hypothesis testing to identify significantly better-performing models.
Abstract: Current standard practices for comparing machine learning models in low-data regimes, common in materials discovery, lack statistical rigour. We present Automated model selection using Statistical Testing for Robust Algorithms (ASTRA), which combines model training using cross-validation (CV) with statistical hypothesis testing to identify significantly better-performing models. Evaluating ASTRA on hundreds of synthetic data sets and on real-life drug discovery data sets from the ASAP Discovery x OpenADMET challenge shows that it selects better models than choosing the model with the best mean or median CV score, particularly in classification settings and when CV scores do not correlate significantly with test performance. ASTRA will make it easier to develop new approaches that significantly outperform previous models, and its modular and customisable design allows users to seamlessly integrate it into existing machine learning workflows. ASTRA is freely available in a GitHub repository.
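The abstract's core idea, pairing CV with a hypothesis test rather than picking the best mean CV score, can be illustrated with a minimal sketch. This is not ASTRA's actual implementation; the `select_model` function, the choice of a paired Wilcoxon signed-rank test, and the scikit-learn models are all illustrative assumptions.

```python
# Hedged sketch of CV-plus-hypothesis-testing model selection.
# NOT ASTRA's implementation; function name, test choice, and models
# are assumptions for illustration only.
from scipy.stats import wilcoxon
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score


def select_model(baseline, challenger, X, y, alpha=0.05, n_splits=10, seed=0):
    """Keep the baseline unless the challenger's per-fold CV scores are
    significantly better under a paired Wilcoxon signed-rank test."""
    # Fixed folds so both models are scored on identical splits (paired design).
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores_base = cross_val_score(baseline, X, y, cv=cv)
    scores_chal = cross_val_score(challenger, X, y, cv=cv)
    # One-sided test: is the challenger's fold-wise score distribution higher?
    _, p_value = wilcoxon(scores_chal, scores_base, alternative="greater")
    return ("challenger" if p_value < alpha else "baseline"), p_value


X, y = load_iris(return_X_y=True)
winner, p_value = select_model(
    DummyClassifier(strategy="most_frequent"),
    LogisticRegression(max_iter=1000),
    X, y,
)
```

The point of the paired test is that a challenger is only adopted when its advantage is consistent across folds, not merely large on average; a single lucky fold cannot drive the decision.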
Submission Track: Full Paper
Submission Category: AI-Guided Design
Supplementary Material: zip
Submission Number: 26