A Fresh Look at De Novo Molecular Design Benchmarks

Austin Tripp; Gregor N. C. Simm; José Miguel Hernández-Lobato

Published: 22 Oct 2021, Last Modified: 05 May 2023

NeurIPS-AI4Science Poster

Readers: Everyone

Keywords: De novo molecular design, optimization, molecular optimization, chemistry, benchmark

TL;DR: We perform many de novo design benchmarks with different budgets and dataset sizes to determine their effect on performance.

Abstract: De novo molecular design is a thriving research area in machine learning (ML) that lacks ubiquitous, high-quality, standardized benchmark tasks. Many existing benchmark tasks do not precisely specify a training dataset or an evaluation budget, which is problematic as they can significantly affect the performance of ML algorithms. This work elucidates the effect of dataset sizes and experimental budgets on established molecular optimization methods through a comprehensive evaluation with 11 selected benchmark tasks. We observe that the dataset size and budget significantly impact all methods' performance and relative ranking, suggesting that a meaningful comparison requires more than a single benchmark setup. Our results also highlight the relative difficulty of benchmarks, implying in particular that logP and QED are poor objectives. We end by offering guidance to researchers on their choice of experiments.

Track: Original Research Track
