Network-Based Clustering of Pan-Cancer Data Accounting for Clinical Covariates

09 Oct 2022 (modified: 05 May 2023), LMRL 2022 Paper, Readers: Everyone
Keywords: Clustering, Graphs, Bayesian Networks, Cancer, Networks
TL;DR: We present a clustering method that integrates mutational and clinical covariate data in networks of their probabilistic relationships.
Abstract: Identifying patient subgroups with shared biological properties based on mutational features is a key step towards precision treatment of cancer patients. However, clustering patients based on their mutational profiles is challenging due to considerable heterogeneity within and across cancer types. Here, we approach the heterogeneity of cancer by learning probabilistic relationships within pan-cancer data. We present a network-based clustering method that integrates mutational and clinical covariate data in distinct networks of their probabilistic relationships. To avoid learning clusters driven by covariates such as age and stage, we remove their effect on the cluster assignment by exploiting causal relationships among the variables. In simulations, we demonstrate that our method outperforms standard clustering methods. We apply our method to a large-scale genomic dataset of 8085 cancer patients, where we identify novel clusters that are predictive of survival beyond clinical information and could serve as biomarkers for targeted treatment.