Abstract: Molecular representation learning, which aims to automate feature learning for molecules, is a vital task in computational chemistry and drug discovery. Despite rapid advances in molecular pretraining models built on various featurizations, from SMILES strings and 2D graphs to 3D geometry, there is a paucity of research on how to combine different molecular featurization techniques to obtain better representations. To bridge this gap, we present a novel multiview contrastive learning approach dubbed MEMO in this paper. In particular, our pretraining framework is capable of learning from four basic but nontrivial featurizations of molecules and adaptively optimizing the combination of featurization techniques for each downstream task. Extensive experiments on a broad range of molecular property prediction benchmarks show that MEMO outperforms state-of-the-art baselines and also yields a reasonable interpretation of the featurization weights in accordance with chemical knowledge.
Track: Original Research Track
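The adaptive combination described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes a generic setup in which each featurization (view) has already been encoded into an embedding matrix, a softmax over per-view logits produces the adaptive mixture weights, and an InfoNCE-style contrastive loss aligns the fused representation with an individual view. All array shapes and the `nt_xent` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_mols, dim, n_views = 8, 16, 4  # 4 views, e.g. SMILES, 2D graph, 3D geometry, ...

# Hypothetical per-view embeddings produced by view-specific encoders.
views = rng.normal(size=(n_views, n_mols, dim))

# One learnable logit per featurization; softmax yields the mixture weights
# that would be adapted per downstream task.
logits = np.zeros(n_views)
weights = np.exp(logits) / np.exp(logits).sum()

# Adaptive fusion: weighted sum of the view embeddings -> (n_mols, dim).
fused = np.tensordot(weights, views, axes=1)

def nt_xent(a, b, tau=0.1):
    """InfoNCE-style loss; row i of `a` and row i of `b` are a positive pair."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    sim = a @ b.T / tau                                   # pairwise similarities
    sim = sim - sim.max(axis=1, keepdims=True)            # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                    # -log p(positive)

# Contrast the fused representation against one view (here: the first).
loss = nt_xent(fused, views[0])
print(fused.shape, float(loss) > 0.0)
```

In an actual pretraining loop, the view logits would be trained jointly with the encoders, so the gradient of the contrastive loss determines how much each featurization contributes.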