Keywords: ligand generation, drug discovery, geometric learning, 3D structure
TL;DR: We created a machine learning framework to construct ligand molecules inside 3D protein pockets.
Abstract: Computationally aided design of novel molecules has the potential to accelerate drug discovery. Several recent generative models aim to create new molecules for specific protein targets. However, a rate-limiting step in drug development is molecule optimization, which can take years due to the challenge of optimizing multiple molecular properties at once. We developed a method to solve a specific molecular optimization problem in silico: expanding a small, fragment-like starting molecule bound to a protein pocket into a larger molecule that matches the physicochemical properties of known drugs. Using data-efficient E(3)-equivariant neural networks and a 3D atomic point cloud representation, our model learns how to attach new molecular fragments to a growing structure by recognizing realistic intermediates generated en route to a final ligand. This approach always generates chemically valid molecules and incorporates all relevant 3D spatial information from the protein pocket. The framework produces promising molecules as assessed by multiple properties that address binding affinity, ease of synthesis, and solubility. Overall, we demonstrate the feasibility of 3D molecular structure expansion conditioned on protein pockets and provide a tool that could accelerate the work of medicinal chemists.