GeoSSL: Molecular Geometry Self-Supervised Learning with SE(3)-Invariant Denoising Distance Matching

Anonymous

27 Sept 2022 (modified: 05 May 2023)
Submitted to SBM 2022
Readers: Everyone
Keywords: molecular geometry, self-supervised learning
TL;DR: We adapt denoising score matching to molecular geometric pretraining, achieving SOTA performance on downstream tasks.
Abstract: Pretraining molecular representations is critical in a variety of applications for drug and material discovery due to the limited number of labeled molecules, yet most existing work focuses on pretraining on 2D molecular graphs. The power of pretraining on 3D geometric structures, however, has been less explored. This is owing to the difficulty of finding a suitable proxy task that enables pretraining to extract essential features from the geometric structures. Motivated by the dynamic nature of 3D molecules, where the continuous motion of a molecule in 3D Euclidean space forms a smooth potential energy surface, we propose a 3D coordinate denoising pretraining framework to model such an energy landscape. Leveraging an SE(3)-invariant score matching method, we propose GeoSSL, in which the coordinate denoising proxy task reduces to denoising the pairwise atomic distances in a molecule. Our comprehensive experiments confirm the effectiveness and robustness of our proposed method.
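As a rough illustration of why working with pairwise atomic distances gives SE(3) invariance, the sketch below checks numerically that the distance matrix of a toy molecule is unchanged by a rotation and a translation, and then builds a noisy distance matrix from perturbed coordinates. This is not the GeoSSL objective. The geometry, the noise scale `sigma`, the single z-axis rotation, and all helper names are assumptions made only for this example, and the residual at the end is a toy stand-in for a denoising target, not the score matching loss.

```python
import math
import random


def pairwise_distances(coords):
    """Return the matrix of Euclidean distances between all atom pairs."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rotate_z(coords, angle):
    """Rotate every 3D point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def translate(coords, shift):
    """Shift every 3D point by the vector `shift`."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in coords]


def perturb(coords, sigma, rng):
    """Add isotropic Gaussian noise with standard deviation `sigma`."""
    return [tuple(v + rng.gauss(0.0, sigma) for v in p) for p in coords]


def max_abs_diff(d1, d2):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for r1, r2 in zip(d1, d2) for a, b in zip(r1, r2))


# Hypothetical three-atom geometry (roughly water-like), for illustration only.
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
rng = random.Random(0)

clean = pairwise_distances(coords)

# One rigid motion (a special case of SE(3)): rotate about z, then translate.
moved = pairwise_distances(translate(rotate_z(coords, 1.3), (2.0, -1.0, 0.5)))
invariance_error = max_abs_diff(clean, moved)

# Coordinate noise shows up as noise on the pairwise distances.
noisy = pairwise_distances(perturb(coords, 0.1, rng))

# Toy residual a denoiser would have to recover (assumption, not the paper's loss).
distance_residual = [
    [c - n for c, n in zip(row_c, row_n)] for row_c, row_n in zip(clean, noisy)
]

print("invariance error:", invariance_error)
print("max distance perturbation:", max_abs_diff(clean, noisy))
```

Because the distances do not depend on the molecule's pose, a denoising task defined on them inherits that invariance without needing to align the clean and noisy coordinates first.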
Student Paper: Yes