Abstract: Imaging-based spatial transcriptomics (ST) provides
single-transcript-level spatial resolution for hundreds of
genes, unlike sequencing-based ST technologies whose resolution is limited to physical capture regions (spots) on
slides. Existing methods to identify patterns of interest in
imaging-based ST data are built as extensions of single-cell
analysis methods, mostly ignoring valuable spatial information encoded in the raw imaging data. Here we present
a discrete representation learning approach for modeling
spatial gene expression patterns in ST datasets. By employing raw coordinates of detected transcripts and positional
encoding of cell centroids as inputs, we learn discrete representations using a Vector Quantized-Variational Autoencoder
(VQ-VAE) to extract multi-scale structures from fluorescence in situ hybridization (FISH)-based ST datasets. We
demonstrate the usefulness of discrete representations in
terms of the embedding quality of ST data and improved performance on the downstream tasks of extracting biologically meaningful cellular neighborhoods and spatially
variable genes.
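
To make the two ingredients named above concrete, the following is a minimal sketch of (i) a positional encoding of cell centroids and (ii) the vector quantization step at the core of a VQ-VAE. It is an illustration under stated assumptions, not the model described here: the sinusoidal form of the encoding, the frequency schedule, and the function names `sinusoidal_encoding` and `quantize` are hypothetical choices, and the encoder, decoder, and training losses are omitted.

```python
import numpy as np


def sinusoidal_encoding(coords, num_freqs=4):
    """Encode (N, 2) centroid coordinates as sin/cos features.

    Assumption: a sinusoidal scheme with frequencies 1, 2, 4, ...;
    the abstract does not specify which positional encoding is used.
    """
    coords = np.asarray(coords, dtype=float)
    freqs = 2.0 ** np.arange(num_freqs)          # shape (num_freqs,)
    angles = coords[:, :, None] * freqs          # shape (N, 2, num_freqs)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(coords.shape[0], -1)    # shape (N, 4 * num_freqs)


def quantize(latents, codebook):
    """Replace each latent vector by its nearest codebook entry.

    latents:  (N, D) continuous encoder outputs
    codebook: (K, D) discrete code vectors
    Returns the integer code index per latent and the quantized vectors.
    """
    latents = np.asarray(latents, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance between every latent and every code: (N, K)
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = d2.argmin(axis=1)
    return codes, codebook[codes]


# Toy usage: three cell centroids and a two-entry codebook.
centroids = np.array([[0.0, 0.0], [10.0, 5.0], [3.5, 7.25]])
pos_features = sinusoidal_encoding(centroids, num_freqs=4)

toy_codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
toy_latents = np.array([[0.1, 0.0], [0.9, 1.2]])
codes, quantized = quantize(toy_latents, toy_codebook)
```

In a full VQ-VAE the integer codes returned by the quantization step serve as the discrete representation, and the codebook is learned jointly with the encoder and decoder rather than fixed as in this toy example.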