Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space

Published: 17 Jun 2024, Last Modified: 13 Jul 2024, ICML 2024 Workshop GRaM, CC BY 4.0
Track: Extended abstract
Keywords: Drug Design, Computational Biology, Molecule Generation, Graph Generation, Latent Diffusion Models
TL;DR: We propose a latent Euclidean space for molecular graph generation and demonstrate that a diffusion model on such a space achieves state-of-the-art performance on common molecular graph generation benchmarks.
Abstract: We introduce a new framework for molecular graph generation using 3D molecular generative models. Our Synthetic Coordinate Embedding (SyCo) framework maps molecular graphs to Euclidean point clouds via synthetic conformer coordinates and learns the inverse map using an E($n$)-Equivariant Graph Neural Network (EGNN). The induced point-cloud-structured latent space is well suited to existing 3D molecular generative models. This approach reduces graph generation to a point cloud generation problem followed by node and edge classification tasks, without relying on molecular fragments or autoregressive decoding. As a concrete implementation of our framework, we develop EDM-SyCo based on the E(3) Equivariant Diffusion Model (EDM). It achieves state-of-the-art performance in distribution learning of molecular graphs, outperforming the best non-autoregressive methods by more than 30\% on ZINC250K and 16\% on GuacaMol while improving conditional generation by up to 3.9 times.
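Illustrative sketch (not the authors' implementation): the round trip described in the abstract, from molecular graph to point cloud and back to graph, can be mimicked on a toy example. Everything here is an assumption made for illustration: the coordinates are hand-picked rather than produced by a conformer generator, the function names `lift` and `decode_edges` are hypothetical, and a fixed distance cutoff stands in for the learned EGNN edge classifier.

```python
import math

# Toy molecular graph: heavy atoms of ethanol (C-C-O) and its bonds.
atoms = ["C", "C", "O"]
bonds = {(0, 1), (1, 2)}

# Hypothetical "synthetic" coordinates in angstroms, hand-picked for this demo.
coords = [(0.0, 0.0, 0.0), (1.52, 0.0, 0.0), (2.04, 1.33, 0.0)]


def lift(atom_types, positions):
    """Map a graph's nodes to a point cloud of (atom type, xyz) pairs.

    The edge set is deliberately dropped: it must be recovered by the decoder.
    """
    return list(zip(atom_types, positions))


def decode_edges(cloud, cutoff=1.8):
    """Classify each atom pair as bonded or not.

    Assumption: a distance cutoff replaces the learned edge classifier.
    """
    edges = set()
    for i in range(len(cloud)):
        for j in range(i + 1, len(cloud)):
            if math.dist(cloud[i][1], cloud[j][1]) < cutoff:
                edges.add((i, j))
    return edges


cloud = lift(atoms, coords)
recovered = decode_edges(cloud)

# Translating the whole cloud leaves pairwise distances, and thus edges, unchanged.
shifted = [(t, (x + 5.0, y - 2.0, z + 1.0)) for t, (x, y, z) in cloud]
recovered_shifted = decode_edges(shifted)
```

In this toy setting the decoder depends only on pairwise distances, so rigid motions of the point cloud do not change the decoded graph. That is the property an equivariant decoder is meant to provide, although the real model learns node and edge classification rather than applying a fixed threshold.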
Submission Number: 25