Abstract: Deep graph neural networks have recently demonstrated powerful representation learning capabilities in bioinformatics. Designing a representation model that fuses fundamental chemistry and biology knowledge remains crucial and challenging. However, most existing representation models not only ignore such domain knowledge but also rely on labeled datasets. To address this issue, this paper proposes a hierarchical structure-aware pre-training model that uses contrastive learning to improve molecular representations with unlabeled datasets. We conducted comprehensive experiments on 13 molecular benchmark datasets from different application domains. The results demonstrate that our hierarchical structure-aware pre-trained model achieves superior performance against state-of-the-art baselines.
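As a rough illustration of the kind of contrastive pre-training objective the abstract refers to, the sketch below implements an NT-Xent loss over two augmented views of a batch of molecular graphs. This is not the authors' implementation; the encoder, augmentation function, and embedding dimensions are hypothetical placeholders, and only the loss itself is shown.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent contrastive loss between two views of the same molecules.

    z1, z2: (batch, dim) graph-level embeddings of two augmented views.
    Positive pairs are (z1[i], z2[i]); all other embeddings in the batch
    act as negatives.
    """
    batch = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2B, d), unit norm
    sim = z @ z.t() / temperature                             # scaled cosine similarities
    mask = torch.eye(2 * batch, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                # exclude self-similarity
    # The positive for row i is row i + B (and vice versa).
    pos = torch.cat([torch.arange(batch, 2 * batch),
                     torch.arange(0, batch)]).to(z.device)
    return F.cross_entropy(sim, pos)

# Hypothetical usage: encoder() is any GNN producing graph-level embeddings,
# and augment() returns a structure-perturbed view of the input graphs.
# z1, z2 = encoder(augment(batch)), encoder(augment(batch))
# loss = nt_xent_loss(z1, z2)
```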