TarDis: Achieving Robust and Structured Disentanglement of Multiple Covariates

Published: 17 Jun 2024, Last Modified: 17 Jul 2024
Venue: ICML2024-AI4Science Poster
License: CC BY 4.0
Keywords: covariate disentanglement, generative models, invariant representation learning, latent space optimization, self-supervised learning, single-cell genomics
TL;DR: TarDis is a deep generative model that disentangles multiple covariates in single-cell genomics data, producing structured and interpretable latent spaces that improve data integration and out-of-distribution (OOD) prediction.
Abstract: Addressing challenges in domain invariance within single-cell genomics necessitates innovative strategies to manage the heterogeneity of multi-source datasets while maintaining the integrity of biological signals. We introduce TarDis, a novel deep generative model designed to disentangle intricate covariate structures across diverse biological datasets, distinguishing technical artifacts from true biological variations. By employing tailored covariate-specific loss components and a self-supervised approach, TarDis effectively generates multiple latent space representations that capture each continuous and categorical target covariate separately, along with unexplained variation. Our extensive evaluations demonstrate that TarDis outperforms existing methods in data integration, covariate disentanglement, and robust out-of-distribution predictions. The model's capacity to produce interpretable and structured latent spaces, including ordered latent representations for continuous covariates, enhances its utility in hypothesis-driven research. Consequently, TarDis offers a promising analytical platform for advancing scientific discovery, providing insights into cellular dynamics, and enabling targeted therapeutic interventions.
Submission Number: 82
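
The abstract describes separate latent representations for each target covariate plus one for unexplained variation. As a rough illustration only, and not the TarDis implementation, one way to picture that layout is to partition a single latent vector into per-covariate blocks and a leftover block. The function `split_latent`, the covariate names `age` and `batch`, and the block widths below are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence


def split_latent(z: Sequence[float], reserved_dims: Dict[str, int]) -> Dict[str, List[float]]:
    """Partition a latent vector into named blocks (illustrative sketch only).

    Each covariate in `reserved_dims` receives a contiguous block of the given
    width; the remaining dimensions go to an "unreserved" block standing in
    for unexplained variation.
    """
    if "unreserved" in reserved_dims:
        raise ValueError("'unreserved' is a reserved block name")
    if sum(reserved_dims.values()) > len(z):
        raise ValueError("reserved widths exceed the latent dimension")

    blocks: Dict[str, List[float]] = {}
    start = 0
    for name, width in reserved_dims.items():
        blocks[name] = list(z[start:start + width])
        start += width
    # Whatever is left is not tied to any target covariate.
    blocks["unreserved"] = list(z[start:])
    return blocks


# Hypothetical example: a 6-dimensional latent vector with two dimensions for
# a continuous covariate ("age") and one for a categorical one ("batch").
z = [0.1, -0.4, 1.2, 0.7, -0.3, 0.05]
blocks = split_latent(z, {"age": 2, "batch": 1})
```

In a layout like this, covariate-specific loss terms would act on the corresponding block only, leaving the unreserved block to absorb the remaining variation; how TarDis actually defines and trains these components is specified in the paper, not here.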