Diverse Hits in de Novo Molecule Design: A Diversity-based Comparison of Goal-directed Generators

Published: 04 Mar 2024, Last Modified: 29 Apr 2024, GEM Poster, CC BY 4.0
Track: Machine learning: computational method and/or computational results
Cell: I do not want my work to be considered for Cell Systems
Keywords: generative models, molecular design, molecular sciences, drug design
TL;DR: We present a diversity-focused comparative study of goal-directed generative models for molecules
Abstract: Goal-directed molecular generators have been proposed as a way to discover novel drug candidates, but they are often prone to "mode collapse", in which they generate only a limited number of similar compounds. The research community has identified the need to generate a diverse set of desired molecules, and thereby increase the chances of success in drug discovery projects, as a central problem. However, common benchmarks often lack adequate diversity metrics and overlook the impact of the search budget on model performance. We address these two shortcomings by a) offering a diversity-based evaluation of goal-directed generative models using the principled #Circles metric, and b) evaluating the models under constraints on the number of calls to the scoring functions or on the available compute time. Notably, our findings highlight the superior performance of SMILES-based auto-regressive models over graph-based/genetic-algorithm counterparts in generating diverse sets of desired compounds.
Submission Number: 64