Keywords: Optimal Transport, Frank-Wolfe, Cell type deconvolution, Spatial data
Abstract: Single-cell RNA sequencing (scRNA-seq) and spatially-resolved imaging/sequencing technologies are the current cutting edge of transcriptomics data generation in biomedical research. On one hand, scRNA-seq data provides rich, high-throughput information spanning the entire transcriptome, but sacrifices the structural context of the cells. On the other hand, high-resolution measurements of the spatial context of cells come with a trade-off in throughput and coverage. Combining data from these two modalities facilitates a better understanding of the development and organization of complex tissues, as well as the emerging processes and functions of the distinct constituent cell types within the tissue. Recent approaches focus only on the expression of genes available in both modalities and do not incorporate other relevant and available features, especially the spatial context. We propose DOT, a novel optimization framework for assigning cell types to tissue locations, ensuring a high-quality mapping by taking into account relevant but previously neglected features of the data. Our model (i) incorporates ideas from Optimal Transport theory to exploit structural similarities in the data modalities, leveraging not only joint features but also distinct features, i.e., the spatial context, (ii) introduces scale-invariant distance functions to account for differences in the sensitivity of different measurement technologies, (iii) ensures representation of rare cell types using Nash-fairness objectives, and (iv) provides control over the abundance of cell types in the localization. We present a fast implementation based on the Frank-Wolfe algorithm and demonstrate the effectiveness of DOT in correctly assigning cell types to spatial data coming from (i) the primary motor cortex of the mouse brain, (ii) the primary somatosensory cortex of the mouse brain, and (iii) the developing human heart.
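To illustrate the kind of Frank-Wolfe iteration the abstract refers to, the following is a minimal sketch and not the DOT model itself. It assumes a deliberately simplified objective, a least-squares fit of spot expression to cell-type centroids, with each spot's cell-type weights constrained to the probability simplex. The function name `frank_wolfe_assign`, the matrices `X` and `C`, and the toy data are all hypothetical; the spatial, scale-invariant, and fairness terms described in the abstract are omitted.

```python
import numpy as np


def frank_wolfe_assign(X, C, n_iters=200, tol=1e-10):
    """Frank-Wolfe sketch for a simplified (assumed) objective.

    Minimizes 0.5 * ||Y @ C - X||_F^2 over Y, where every row of Y
    lies on the probability simplex.
    X: (n_spots, n_genes) spot expression.
    C: (n_types, n_genes) cell-type centroids.
    """
    n_spots, n_types = X.shape[0], C.shape[0]
    # Feasible starting point: uniform weights for every spot.
    Y = np.full((n_spots, n_types), 1.0 / n_types)
    rows = np.arange(n_spots)
    for k in range(n_iters):
        grad = (Y @ C - X) @ C.T
        # Linear minimization oracle: for each row, the simplex vertex
        # with the smallest gradient entry.
        S = np.zeros_like(Y)
        S[rows, grad.argmin(axis=1)] = 1.0
        # Frank-Wolfe duality gap; stop once it is (numerically) zero.
        gap = np.sum(grad * (Y - S))
        if gap <= tol:
            break
        gamma = 2.0 / (k + 2.0)  # standard open-loop step size
        Y = (1.0 - gamma) * Y + gamma * S
    return Y


# Toy data (hypothetical): 3 cell types, 4 genes, 5 spots of pure types.
C = np.array([[5.0, 0.0, 0.0, 1.0],
              [0.0, 5.0, 0.0, 1.0],
              [0.0, 0.0, 5.0, 1.0]])
truth = np.array([0, 1, 2, 1, 0])
X = np.eye(3)[truth] @ C
Y = frank_wolfe_assign(X, C)
```

Because each iterate is a convex combination of feasible points, the rows of `Y` stay on the simplex without any projection step, which is the main practical appeal of Frank-Wolfe for transport-style constraints.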
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Machine Learning for Sciences (e.g., biology, physics, health sciences, social sciences, climate/sustainability)
TL;DR: Fast Optimal Transport for robust cell type mapping in high and low resolution spatial data