Exploring Chemical Space with Score-based Out-of-distribution Generation

Published: 06 Mar 2023, Last Modified: 17 Nov 2024 | ICLR 2023 - MLDD Oral | Readers: Everyone
Keywords: molecule generation, diffusion models, score-based models, out-of-distribution
TL;DR: We propose a score-based molecular generative framework that aims to generate out-of-distribution molecules beyond the known molecular space and find novel chemical optima of desired properties.
Abstract: A well-known limitation of existing molecular generative models is that the generated molecules highly resemble those in the training set. To generate truly novel molecules that may have even better properties for de novo drug discovery, more powerful exploration in the chemical space is necessary. To this end, we propose Molecular Out-Of-distribution Diffusion (MOOD), a novel score-based diffusion scheme that incorporates out-of-distribution (OOD) control in the generative stochastic differential equation (SDE) by simply adjusting a hyperparameter, and thus requires no additional computational cost. Since some novel molecules may be chemically implausible or may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor, which guide the reverse-time diffusion process to high-scoring regions according to target properties such as protein-ligand interactions, drug-likeness, and synthesizability. This allows MOOD to search for novel and meaningful molecules rather than generating unseen yet trivial ones. We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore those found with existing methods, and even the top 0.01% of the original training pool.
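The two mechanisms named in the abstract, a hyperparameter that loosens the pull toward the training distribution and a property-predictor gradient that steers sampling, can be illustrated with a toy sketch. This is not the authors' formulation: it assumes a 2-D standard Gaussian stands in for the training distribution, uses a plain Langevin-style update rather than a reverse-time SDE over molecular graphs, and all names (`ood_weight`, `guidance_weight`, `property_grad`, the quadratic property function) are hypothetical.

```python
import numpy as np


def data_score(x):
    """Score (gradient of the log-density) of a toy N(0, I) 'training' distribution."""
    return -x


def property_grad(x, target):
    """Gradient of a hypothetical property predictor f(x) = -0.5 * ||x - target||^2."""
    return target - x


def guided_step(x, target, ood_weight, guidance_weight, step_size, noise):
    """One Langevin-style update (an assumed stand-in for a reverse-time SDE step).

    ood_weight = 0 keeps the full pull toward the training distribution;
    ood_weight = 1 removes it entirely (assumed scaling, for illustration only).
    """
    drift = (1.0 - ood_weight) * data_score(x) + guidance_weight * property_grad(x, target)
    return x + step_size * drift + np.sqrt(2.0 * step_size) * noise


def sample(target, ood_weight, guidance_weight, n_steps=500, step_size=0.01, seed=0):
    """Run the toy sampler from a random start and return the final point."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(target.shape)
    for _ in range(n_steps):
        noise = rng.standard_normal(target.shape)
        x = guided_step(x, target, ood_weight, guidance_weight, step_size, noise)
    return x


if __name__ == "__main__":
    target = np.array([3.0, 0.0])  # hypothetical high-property region far from the data mode
    print("in-distribution run:", sample(target, ood_weight=0.0, guidance_weight=1.0))
    print("OOD-leaning run:   ", sample(target, ood_weight=0.9, guidance_weight=1.0))
```

In this toy setting, raising `ood_weight` weakens the drift back toward the origin, so the property gradient can carry samples further from the training mode. The extra cost per step is one scalar multiplication, which is the spirit of the abstract's claim that OOD control adds no computational overhead.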
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2206.07632/code)