Cross-modal Graph Contrastive Learning with Cellular Images

Published: 01 Feb 2023 · Last Modified: 13 Feb 2023 · Submitted to ICLR 2023 · Readers: Everyone
Keywords: Cellular Image, Drug discovery, Graph neural networks, Self-supervised learning, Cross-modal learning
TL;DR: We propose using high-content cell images to assist in learning molecular representations in a cross-modal framework.
Abstract: Constructing discriminative representations of molecules lies at the core of a number of domains such as drug discovery, materials science, and chemistry. State-of-the-art methods employ graph neural networks (GNNs) and self-supervised learning (SSL) to learn structural representations from unlabeled data, which can then be fine-tuned for downstream tasks. Albeit powerful, these methods, pre-trained solely on molecular structures, cannot generalize well to tasks involving intricate biological processes. To cope with this challenge, we propose using high-content cell microscopy images to assist in learning molecular representations. The fundamental rationale of our method is to leverage the correspondence between molecular topological structures and the perturbations they cause at the phenotypic level. By combining cross-modal pre-training with different types of contrastive loss functions in a unified framework, our model can efficiently learn generic and informative representations from cellular images, which are complementary to molecular structures. Empirical experiments demonstrate that the model transfers non-trivially to a variety of downstream tasks and is often competitive with existing SSL baselines, e.g., a 15.4% absolute Hit@10 gain on the graph-image retrieval task and a 4.0% absolute AUC improvement in clinical outcome prediction. Further zero-shot case studies show the approach's potential for real-world drug discovery.
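
To make the cross-modal contrastive objective in the abstract concrete, below is a minimal sketch of a CLIP-style symmetric InfoNCE loss between molecular-graph embeddings and cell-image embeddings. This illustrates the general technique, not the authors' exact formulation; the function name, embedding dimension, batch size, and temperature value are all illustrative assumptions.

```python
# Minimal sketch (not the paper's exact method) of a symmetric InfoNCE
# loss pairing molecular-graph embeddings with cell-image embeddings.
# Encoder architectures, dimensions, and the temperature are assumptions.
import torch
import torch.nn.functional as F

def cross_modal_infonce(graph_emb, image_emb, temperature=0.07):
    """Matched (graph, image) pairs in the batch are positives;
    all other pairings serve as in-batch negatives."""
    g = F.normalize(graph_emb, dim=-1)    # (B, D) unit-norm graph embeddings
    v = F.normalize(image_emb, dim=-1)    # (B, D) unit-norm image embeddings
    logits = g @ v.t() / temperature      # (B, B) pairwise cosine similarities
    targets = torch.arange(g.size(0), device=g.device)
    loss_g2i = F.cross_entropy(logits, targets)      # retrieve image given graph
    loss_i2g = F.cross_entropy(logits.t(), targets)  # retrieve graph given image
    return 0.5 * (loss_g2i + loss_i2g)

# Toy usage: random tensors stand in for GNN / image-encoder outputs.
graph_emb = torch.randn(32, 256)
image_emb = torch.randn(32, 256)
print(cross_modal_infonce(graph_emb, image_emb))
```

Under this objective, the graph-to-image direction directly corresponds to the graph-image retrieval task evaluated by Hit@10 in the abstract.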
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)
Supplementary Material: zip