TacoGFN: Target Conditioned GFlowNet for Drug Design

Tony Shen; Mohit Pandey; Martin Ester

Published: 27 Oct 2023, Last Modified: 22 Nov 2023, GenBio@NeurIPS2023 Spotlight

Keywords: Drug Discovery, Molecule Generation, Conditional Generation, Generative Flow Networks, Multi-objective Optimization

TL;DR: An RL approach for pocket-conditioned drug design that outperforms existing methods across all molecular properties while significantly reducing runtime.

Abstract: We seek to automate the generation of drug-like compounds conditioned on specific protein pocket targets. Most current methods approximate the protein-molecule distribution of a finite dataset and therefore struggle to generate molecules with significantly better binding than those in the training dataset. We instead frame pocket-conditioned molecular generation as an RL problem and develop TacoGFN, a target-conditional Generative Flow Network model. Our method is explicitly encouraged to generate molecules with desired properties rather than to fit a pre-existing data distribution. To this end, we develop a transformer-based docking score predictor to speed up docking score computation and propose TacoGFN to explore molecule space efficiently. Furthermore, we incorporate several rounds of active learning in which generated samples are scored by a docking oracle to improve the docking score predictor. This approach allows us to accurately explore as much of the molecule landscape as we can afford computationally. Empirically, molecules generated using TacoGFN and its variants significantly outperform all baseline methods across every property (docking score, QED, SA, Lipinski), while being orders of magnitude faster.
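
The active learning loop described in the abstract can be illustrated with a toy sketch. This is not the TacoGFN implementation: the 1-D "molecules", the quadratic `oracle`, the nearest-neighbor `surrogate`, and the names `sample_candidates` and `active_learning` are all hypothetical stand-ins chosen for illustration. The only ideas carried over are that a cheap predictor guides sampling toward high-reward candidates and that an expensive oracle labels the sampled batch each round to improve the predictor.

```python
import math
import random


def oracle(x):
    """Hypothetical stand-in for an expensive docking run (higher is better, peak at x = 2)."""
    return -(x - 2.0) ** 2


def surrogate(x, labeled):
    """Cheap predictor: return the oracle score of the nearest already-labeled point."""
    nearest = min(labeled, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def sample_candidates(pool, labeled, k, beta, rng):
    """Draw k candidates with probability proportional to exp(beta * predicted score)."""
    preds = [surrogate(x, labeled) for x in pool]
    top = max(preds)  # subtract the max for numerical stability
    weights = [math.exp(beta * (p - top)) for p in preds]
    return rng.choices(pool, weights=weights, k=k)


def active_learning(rounds=5, k=8, beta=1.0, seed=0):
    """Alternate between surrogate-guided sampling and oracle labeling."""
    rng = random.Random(seed)
    pool = [i / 10.0 for i in range(-50, 51)]  # toy candidate space
    labeled = [(x, oracle(x)) for x in (-5.0, 0.0, 5.0)]  # small initial dataset
    for _ in range(rounds):
        batch = sample_candidates(pool, labeled, k, beta, rng)
        # Query the expensive oracle only on the sampled batch, then grow the dataset.
        labeled.extend((x, oracle(x)) for x in batch)
    return labeled


if __name__ == "__main__":
    data = active_learning()
    best_x, best_y = max(data, key=lambda pair: pair[1])
    print(f"labeled points: {len(data)}, best x: {best_x}, best score: {best_y}")
```

In this sketch the oracle budget is `rounds * k` calls, which mirrors the trade-off in the abstract: the surrogate is used for every candidate, while the oracle is reserved for the samples the generator actually proposes.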

Submission Number: 22
