Multi-modal Spatial Clustering for Spatial Transcriptomics Utilizing High-resolution Histology Images

Published: 2024, Last Modified: 20 Oct 2025, Venue: BIBM 2024, License: CC BY-SA 4.0
Abstract: Understanding the intricate cellular environment within biological tissues is crucial for uncovering insights into complex biological functions. While single-cell RNA sequencing has significantly enhanced our understanding of cellular states, it lacks the spatial context needed to fully comprehend the cellular environment. Spatial transcriptomics (ST) addresses this limitation by enabling transcriptome-wide profiling while preserving spatial context. One of the principal challenges in ST data analysis is spatial clustering. Modern ST sequencing procedures typically include a high-resolution histology image, which previous studies have shown to be closely connected to gene expression profiles. However, current spatial clustering methods often fail to fully utilize the image information, limiting their ability to capture critical spatial and cellular interactions. In this study, we propose the spatial transcriptomics multimodal clustering (stMMC) model, a novel contrastive learning-based deep learning approach that integrates gene expression data with histology image features through a multi-modal parallel graph autoencoder. We tested stMMC against four state-of-the-art baseline models on two public ST datasets. The experiments demonstrated the superior performance of stMMC in terms of adjusted Rand index (ARI) and normalized mutual information (NMI), and an ablation study validated the contributions of key components.
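The two evaluation metrics named in the abstract, ARI and NMI, both compare a predicted clustering with reference labels and are invariant to how the clusters are numbered. The following is a minimal sketch of their standard definitions in plain Python, not the authors' evaluation code. The function names and the toy label vectors are hypothetical, and the NMI here is normalized by the arithmetic mean of the two entropies, which is an assumption since the abstract does not state which normalization was used.

```python
from collections import Counter
from math import comb, log


def _contingency(labels_true, labels_pred):
    """Count co-occurrences of (true label, predicted label) pairs."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)   # size of each reference cluster
    cols = Counter(labels_pred)   # size of each predicted cluster
    return pairs, rows, cols


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index: pair-counting agreement corrected for chance."""
    pairs, rows, cols = _contingency(labels_true, labels_pred)
    n = len(labels_true)
    index = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two points: treat as trivially identical
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both clusterings have one cluster
    return (index - expected) / (max_index - expected)


def _entropy(counts, n):
    """Shannon entropy (natural log) of a clustering given cluster sizes."""
    return -sum((c / n) * log(c / n) for c in counts.values())


def normalized_mutual_info(labels_true, labels_pred):
    """NMI with arithmetic-mean normalization (assumed variant)."""
    pairs, rows, cols = _contingency(labels_true, labels_pred)
    n = len(labels_true)
    mutual_info = sum(
        (n_ij / n) * log(n * n_ij / (rows[t] * cols[p]))
        for (t, p), n_ij in pairs.items()
    )
    h_true, h_pred = _entropy(rows, n), _entropy(cols, n)
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both clusterings consist of a single cluster
    return mutual_info / ((h_true + h_pred) / 2)


if __name__ == "__main__":
    # Hypothetical toy labels for six spots: reference layers vs. clusters.
    reference = [0, 0, 0, 1, 1, 1]
    predicted = [1, 1, 0, 0, 0, 0]
    print("ARI:", adjusted_rand_index(reference, predicted))
    print("NMI:", normalized_mutual_info(reference, predicted))
```

Because both scores depend only on which spots are grouped together, a prediction that matches the reference up to a renaming of cluster IDs scores 1.0 on both metrics, while a clustering no better than chance has an ARI near 0 (and can be negative).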