TensorVAE: a simple and efficient generative model for conditional molecular conformation generation

Published: 29 Jan 2024, Last Modified: 29 Jan 2024. Accepted by TMLR.
Abstract: Efficient generation of 3D conformations of a molecule from its 2D graph is a key challenge in in-silico drug discovery. Deep learning (DL) based generative modelling has recently become a potent tool for tackling this challenge. However, many existing DL-based methods are either indirect, leveraging inter-atomic distances, or direct but requiring numerous sampling steps to generate conformations. In this work, we propose a simple model, TensorVAE, capable of generating conformations directly from a 2D molecular graph in a single step. The main novelty of the proposed method lies in feature engineering. We develop a novel encoding and feature extraction mechanism relying solely on standard convolution operations to generate a token-like feature vector for each atom. These feature vectors are then transformed through standard transformer encoders under a conditional Variational Autoencoder framework to generate conformations directly. We show through experiments on two benchmark datasets that, with intuitive feature engineering, a relatively simple and standard model can provide promising generative capability, outperforming more than a dozen state-of-the-art models that employ more sophisticated and specialized generative architectures.
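The pipeline described in the abstract (per-atom tokens from a convolution, followed by single-step decoding under a conditional VAE) can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the (N, N, C) graph encoding, the function names `atom_tokens`, `reparameterize` and `generate`, the tensor sizes, and the random weights. In particular, a single linear map stands in for the transformer encoder stack.

```python
import numpy as np

def atom_tokens(graph, kernel):
    # graph: (N, N, C) tensor, assumed to hold atom features on the diagonal
    # and bond features off the diagonal. kernel: (N, C, D).
    # A "valid" convolution whose kernel spans one full row reduces to this
    # contraction and yields one D-dimensional token per atom.
    return np.einsum("ijc,jcd->id", graph, kernel)

def reparameterize(mu, log_var, rng):
    # Standard VAE reparameterization: z = mu + sigma * eps, eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def generate(tokens, w_dec, latent_dim, rng):
    # Generation: draw z from the prior N(0, I), condition on the atom
    # tokens, and decode all coordinates in a single pass.
    n_atoms = tokens.shape[0]
    zeros = np.zeros((n_atoms, latent_dim))
    z = reparameterize(zeros, zeros, rng)        # mu = 0, log_var = 0
    h = np.concatenate([tokens, z], axis=-1)     # condition + latent
    return h @ w_dec                             # (N, 3); hypothetical linear decoder

rng = np.random.default_rng(0)
n_atoms, n_channels, token_dim, latent_dim = 5, 4, 8, 8   # illustrative sizes
graph = rng.random((n_atoms, n_atoms, n_channels))
graph = 0.5 * (graph + graph.transpose(1, 0, 2))          # symmetric, like an undirected graph
kernel = rng.standard_normal((n_atoms, n_channels, token_dim))
w_dec = rng.standard_normal((token_dim + latent_dim, 3))

tokens = atom_tokens(graph, kernel)
coords = generate(tokens, w_dec, latent_dim, rng)
print(tokens.shape, coords.shape)  # (5, 8) (5, 3)
```

Calling `generate` repeatedly with the same tokens draws a new latent each time, which is how a conditional VAE would produce multiple candidate conformations for one molecule without iterative sampling steps.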
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: 1. Figures 5 & 6 revised to improve clarity and legend description 2. Deanonymized
Code: https://github.com/yuh8/TensorVAE
Assigned Action Editor: ~Gabriel_Loaiza-Ganem1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1622