Keywords: molecular representation learning
TL;DR: This paper proposes a point-based deep network for molecular representation learning from three-dimensional conformers.
Abstract: Molecular representation learning (MRL) aims to embed molecules into vectors in a high-dimensional latent space, which can be used (and reused) for the prediction of various molecular properties. Most current MRL models take SMILES (Simplified Molecular-Input Line-Entry System) strings or molecular graphs as the input format of molecules. As a result, these methods may not capture the full information encoded in three-dimensional (3D) molecular conformations (also known as conformers). Because mature algorithms exist for generating 3D molecular conformers, we propose to exploit the abundant geometric information in the conformers by representing molecules as point sets and adapting a point-based deep neural network to MRL. Specifically, we design an atom-shared elemental operation that extracts features from individual atoms as well as from atomic interactions (including covalent bonds and non-covalent interactions), and a mini-network that makes the representation invariant to rotations and translations of the molecular conformers. We train the resulting deep neural network (referred to as Mol3DNet) on a variety of molecular property prediction tasks using benchmark datasets. The experimental results show that Mol3DNet achieves state-of-the-art performance on these classification and regression tasks, except for one task (solubility prediction), where all deep learning models underperform a customized machine learning model.
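As an illustration of the invariance property mentioned in the abstract, the sketch below treats a conformer as a plain point set and checks that one simple descriptor, the sorted list of pairwise atomic distances, is unchanged by a rotation and a translation. This is not Mol3DNet's mini-network, which the abstract does not specify; the coordinates, function names, and choice of descriptor are assumptions made only for the example.

```python
import math

def pairwise_distances(points):
    """Return the sorted Euclidean distances between all pairs of points."""
    n = len(points)
    return sorted(
        math.dist(points[i], points[j])
        for i in range(n)
        for j in range(i + 1, n)
    )

def rotate_z(points, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, shift):
    """Shift every point by the vector `shift`."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in points]

# Hypothetical three-atom conformer (illustrative coordinates, in angstroms).
conformer = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

# Apply a rigid motion: rotation followed by translation.
moved = translate(rotate_z(conformer, 1.1), (3.0, -2.0, 5.0))

original = pairwise_distances(conformer)
transformed = pairwise_distances(moved)

# The raw coordinates change, but the distance-based descriptor does not.
invariant = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(original, transformed)
)
```

Raw coordinates fed directly to a network would differ between `conformer` and `moved`, which is why a point-based model needs either an invariant input encoding or a learned alignment step of the kind the abstract describes.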
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Machine Learning for Sciences (e.g., biology, physics, health sciences, social sciences, climate/sustainability)
Supplementary Material: zip