SToFM: a Multi-scale Foundation Model for Spatial Transcriptomics

Published: 01 May 2025, Last Modified: 18 Jun 2025ICML 2025 posterEveryoneRevisionsBibTeXCC BY 4.0
TL;DR: We propose SToFM, a multi-scale foundation model for spatial transcriptomics, which can effectively capture and integrate multi-scale information in spatial transcriptomics data, establishing a comprehensive understanding of biological tissues.
Abstract: Spatial Transcriptomics (ST) technologies provide biologists with rich insights into single-cell biology by preserving spatial context of cells. Building foundational models for ST can significantly enhance the analysis of vast and complex data sources, unlocking new perspectives on the intricacies of biological tissues. However, modeling ST data is inherently challenging due to the need to extract multi-scale information from tissue slices containing vast numbers of cells. This process requires integrating macro-scale tissue morphology, micro-scale cellular microenvironment, and gene-scale gene expression profile. To address this challenge, we propose **SToFM**, a multi-scale **S**patial **T**ranscript**o**mics **F**oundation **M**odel. SToFM first performs multi-scale information extraction on each ST slice, to construct a set of ST sub-slices that aggregate macro-, micro- and gene-scale information. Then an SE(2) Transformer is used to obtain high-quality cell representations from the sub-slices. Additionally, we construct **SToCorpus-88M**, the largest high-resolution spatial transcriptomics corpus for pretraining. SToFM achieves outstanding performance on a variety of downstream tasks, such as tissue region semantic segmentation and cell type annotation, demonstrating its comprehensive understanding of ST data through capturing and integrating multi-scale information.
Lay Summary: Spatial Transcriptomics (ST) technologies provide biologists with rich insights into single-cell biology by preserving spatial context of cells. The form of ST data is tissue slices containing a large number of cells, with each cell containing a high-dimensional gene expression profile. We hope to establish a foundation model for the complex data, which requires integrating macro-scale tissue morphology, micro-scale cellular microenvironment, and gene-scale gene expression profile. To address this challenge, we propose **SToFM**, a multi-scale **S**patial **T**ranscript**o**mics **F**oundation **M**odel, to aggregate information from different scales and obtain high-quality cell representations. Additionally, we construct **SToCorpus-88M**, the largest high-resolution spatial transcriptomics corpus for pretraining. SToFM achieves outstanding performance on a variety of downstream tasks, such as tissue region semantic segmentation and cell type annotation, demonstrating its comprehensive understanding of ST data through capturing and integrating multi-scale information.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/PharMolix/SToFM
Primary Area: Applications->Health / Medicine
Keywords: spatial transcriptomics, multi-scale, foundation model
Submission Number: 272
Loading