Keywords: Computational Biology, Invariant Representation Learning, Generative Models
TL;DR: A method that learns invariant representations across healthy and diseased cells to infer factors of disease influence in scRNA-seq data.
Abstract: A core challenge in computational biology is predicting the effects of disease on healthy tissue. From a machine learning perspective, the effects of disease and other stimulations on single-cell gene expression can be modeled as a domain shift, in a low-dimensional latent space, applied to healthy cells. Guided by principles of domain invariance and compositional models, we present the single-cell Domain Shift Autoencoder (scDSA), a deep generative model for disentangling disease-invariant and disease-specific gene programs at single-cell resolution. scDSA uncovers latent factors that are conserved across healthy and disease cell states and learns how these factors interact with disease. We show that our model i) predicts the counterfactual healthy cell types of diseased cells in unseen patients, ii) captures interpretable representations of one or more diseases, and iii) learns the interaction between disease effects and cell types. scDSA furthers our understanding of how diseases perturb healthy tissue on a patient-specific basis, thereby enabling advances in personalized healthcare.
Submission Number: 24
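The abstract frames disease as a domain shift applied to healthy cells in a latent space. The toy sketch below illustrates only that general idea, under strong simplifying assumptions that are not taken from the paper: the shift is additive, the decoder is linear, and all names (`shift`, `decode`, `delta_disease`, `W`) and values are hypothetical. It is not the scDSA architecture.

```python
# Illustrative sketch only: an additive latent shift with a toy linear decoder.
# Every name and number here is a hypothetical placeholder, not part of scDSA.

def shift(z, delta, sign=1.0):
    """Return z + sign * delta, element-wise."""
    return [zi + sign * di for zi, di in zip(z, delta)]

def decode(z, weights):
    """Toy linear decoder: one output value per row of `weights`."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

# Hypothetical 2-D latent code of a healthy cell and a disease-specific shift.
z_healthy = [1.0, 2.0]
delta_disease = [0.5, -1.0]

# Applying the shift gives the latent code of the diseased cell.
z_diseased = shift(z_healthy, delta_disease)

# Removing the same shift gives the counterfactual healthy latent code.
z_counterfactual = shift(z_diseased, delta_disease, sign=-1.0)

# Hypothetical decoder weights mapping 2 latent dimensions to 3 genes.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
expr_diseased = decode(z_diseased, W)
expr_counterfactual = decode(z_counterfactual, W)
```

In this toy setting, removing the shift recovers the healthy latent code exactly. A learned model would instead have to estimate both the invariant factors and the shift from data, and the interaction between disease and cell type need not be additive.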