Transcriptomics-Conditioned Virtual Tissue Synthesis via Diffusion Transformers

Published: 28 May 2026, Last Modified: 03 Jun 2026 | ICML 2026 FM4LS Workshop Poster | CC BY 4.0
Keywords: spatial transcriptomics, diffusion models, gene expression, histopathology, conditional generation
TL;DR: STMDiT is a diffusion transformer that fuses a pathology foundation model with a scRNA-seq foundation model to synthesize H&E patches from spatial transcriptomics, with a pseudo-label variant that transfers zero-shot to H&E-only cohorts.
Abstract: Spatial transcriptomics couples hematoxylin and eosin (H&E) tissue morphology with spatially resolved gene expression (GE). However, generative models that exploit this coupling to synthesize tissue images from transcriptomic profiles remain scarce. We present STMDiT (Spatial Transcriptomics and Morphology Diffusion Transformer), a diffusion transformer that synthesizes H&E histopathology patches conditioned jointly on morphological embeddings and transcriptomic profiles. Building on PixCell (Yellapragada et al., 2025), we integrate gene expression from a frozen CancerFoundation encoder (Theus et al., 2024) through adaptive layer normalization and per-block cross-attention, and we train under dual classifier-free guidance with independent modality dropout. On the 10x_TuPro Visium melanoma cohort, GE conditioning improves on the no-GE PixCell-B baseline in both image quality (best FID = 252.9 vs 330.7) and transcriptomic fidelity (best AUC = 0.267 vs 0.229, reaching 82% of the real-tile ceiling). Training with DeepSpot’s predicted-transcriptomics pseudo-labels (PTPL) uniquely transfers zero-shot to TCGA_SKCM, an out-of-distribution (OOD) H&E-only melanoma cohort: PTPL-XAttn-PMA-B reaches FID = 690.0, a 57-point improvement over the no-GE baseline (747.1), with a within-model GE-ablation effect of Δ_OOD = +309.5. This enables virtual tissue synthesis beyond native spatial-transcriptomics coverage. Our results indicate that gene-expression conditioning produces morphologically distinct tissue images and supports virtual tissue simulation for hypothesis testing in computational pathology.
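The abstract names dual classifier-free guidance with independent modality dropout but does not give a formula. The sketch below illustrates one common way such a scheme can be set up, not the authors' implementation. The function names, the zero vector used as the null embedding, the dropout probabilities, and the nested guidance combination (morphology first, then gene expression) are all assumptions made for illustration.

```python
import numpy as np


def drop_modalities(morph, ge, p_morph, p_ge, rng):
    """Independently null out each conditioning modality per sample.

    Assumption: the "null" embedding is a zero vector. The abstract does
    not state how dropped conditions are represented.
    """
    keep_morph = rng.random(morph.shape[0]) >= p_morph  # one draw per sample
    keep_ge = rng.random(ge.shape[0]) >= p_ge           # independent draw
    return morph * keep_morph[:, None], ge * keep_ge[:, None]


def dual_cfg(eps_uncond, eps_morph, eps_full, w_morph, w_ge):
    """Combine three noise predictions using two separate guidance scales.

    eps_uncond: prediction with both conditions dropped
    eps_morph:  prediction with morphology only
    eps_full:   prediction with morphology and gene expression
    Assumption: a nested two-condition combination; other orderings exist.
    """
    return (eps_uncond
            + w_morph * (eps_morph - eps_uncond)
            + w_ge * (eps_full - eps_morph))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    morph = rng.normal(size=(4, 8))  # hypothetical morphology embeddings
    ge = rng.normal(size=(4, 6))     # hypothetical gene-expression embeddings
    m, g = drop_modalities(morph, ge, p_morph=0.1, p_ge=0.1, rng=rng)
    print(m.shape, g.shape)
```

Under this formulation, setting both scales to 1 recovers the fully conditioned prediction, and setting `w_ge` to 0 removes the gene-expression contribution while keeping morphology guidance, which is one way a within-model GE ablation could be run at sampling time.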
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 17