Quantization Enhanced Cross-modal Alignment for Gene Expression Prediction

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Gene Expression Prediction, Cross-modal Alignment
TL;DR: Quantization Enhanced Cross-modal Alignment for Gene Expression Prediction
Abstract: In modern healthcare, whole-slide histological images (WSIs) provide information on tissue structure and composition at the microscopic level. Integrating WSIs with gene expression profiles enhances cancer diagnosis and treatment planning, advancing both clinical care and research. However, spatial transcriptomics is costly and requires long sampling times. The intrinsic correlation between histological images and gene expression offers the potential to predict spatial transcriptomics from Hematoxylin-Eosin (H\&E) stained WSIs, reducing time and resource costs. Although existing methods have achieved impressive results, they ignore the heterogeneity between the image and gene expression modalities. In this paper, we propose Quantized Cross-modal Alignment (QCA), which exploits cross-modal interactions to address this modal heterogeneity. To suppress interference from gene-unrelated image features, we develop a Gene-related Image Feature Quantizer (GIFQ) that captures gene-related image features. Meanwhile, we develop an Asymmetric Cross-modal Alignment (ACA) approach, which enables the model to generate discriminative predictions from similar visual presentations. In addition, to counteract the resulting loss of discriminability, a Discriminability-Enhancing Regularization (DER) is devised to regularize both the virtual and real gene features. Experimental results on a breast cancer dataset sampled by solid-phase transcriptome capture show that our QCA model achieves state-of-the-art accuracy in predicting gene expression profiles, improving performance by at least 13\%. Our method uses deep learning to delineate the correlation between morphological features and gene expression, offering new perspectives and tools for discovering biomarkers in histology. The code will be released.
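The abstract describes GIFQ as a quantizer that maps image features onto a discrete set of gene-related codes. Since the paper's code is not yet released, the sketch below shows only the generic nearest-neighbor codebook lookup used by VQ-style quantizers, which such a module would plausibly build on; the function name and toy shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quantize_features(features, codebook):
    """Map each continuous feature vector to its nearest codebook entry,
    the core lookup in VQ-style quantizers (illustrative sketch only)."""
    # features: (n, d) image patch embeddings; codebook: (k, d) learned codes.
    # Squared Euclidean distance between every feature and every code:
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)       # index of the nearest code per feature
    return codebook[idx], idx     # quantized features and their code indices

# Toy usage: three 2-D features against a codebook of two codes.
feats = np.array([[0.1, 0.0], [0.9, 1.0], [1.1, 0.9]])
codes = np.array([[0.0, 0.0], [1.0, 1.0]])
quantized, idx = quantize_features(feats, codes)
# idx -> [0, 1, 1]; each row of `quantized` is the selected code vector.
```

In a trained quantizer the codebook itself is learned (e.g. with a straight-through gradient estimator), so that only image variation correlated with gene expression survives the discretization.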
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10013