Topology over biology: network representation improves multi-omics models without need for prior knowledge

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · Readers: Everyone · License: CC BY 4.0
Keywords: topological deep learning, Mapper algorithm, multi-omics, cancer
TL;DR: We outperform current SOTA techniques for predicting cancer survival using multi-omics data by enhancing the network representation of the data using topological tools.
Abstract: Cancer is a heterogeneous and complex disease with substantial variation in patient outcomes. Multi-omics data (including mRNA expression, DNA methylation and micro-RNA expression) capture transcriptional and post-transcriptional regulation of gene expression within the tumor microenvironment, with the potential to reveal mechanisms responsible for different patient outcomes. However, multi-omics data are complex and high-dimensional, and extracting meaningful features from them through machine learning is challenging. Current SOTA techniques use GNNs on correlation networks built from omics data, and more recent models improve on these by augmenting the correlation networks with known biological interactions and pathways. However, this augmentation relies on the experimental characterization of biological interactions, which requires significant resources. In this work, we take a different approach by enhancing the representation of the correlation networks using topological tools: the Mapper algorithm for pooling nodes, and topological deep learning to represent higher-order interactions. Our novel biology-agnostic models M-SAN and M-HGAT outperform both the naive correlation network approach and models augmented with prior knowledge in survival prediction across six cancer types (breast cancer, colon cancer, kidney cancer, melanoma, lung cancer and ovarian cancer) with sample sizes between 149 and 333. Additionally, by examining the most important feature interactions within our models, we find that they have learned gene interactions corresponding to biological processes relevant to cancer proliferation and metastasis.
Supplementary Material: pdf
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 5509
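
The abstract names two building blocks, a correlation network over omics features and Mapper-based pooling of its nodes, without giving their configuration here. The sketch below is only an illustration of those two ideas under assumptions of our own and is not the authors' implementation: the function names, the correlation threshold, the use of node degree as the Mapper lens, the interval cover, and clustering by connected components are all hypothetical choices.

```python
import numpy as np


def correlation_network(X, threshold=0.5):
    """Boolean feature-feature adjacency from |Pearson r| >= threshold.

    X has shape (samples, features). The threshold is an assumed value.
    """
    corr = np.corrcoef(X, rowvar=False)
    adj = np.abs(corr) >= threshold
    adj = adj | adj.T              # guard against floating-point asymmetry
    np.fill_diagonal(adj, False)   # no self-loops
    return adj


def connected_components(adj, members):
    """Connected components of the subgraph induced by `members`."""
    unseen = {int(m) for m in members}
    components = []
    while unseen:
        start = unseen.pop()
        component, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in list(unseen):
                if adj[u, v]:
                    unseen.remove(v)
                    component.add(v)
                    stack.append(v)
        components.append(frozenset(component))
    return components


def mapper_pool(adj, lens, n_intervals=4, overlap=0.3):
    """Mapper-style pooling of graph nodes into supernodes.

    lens: one scalar per node (the filter function).
    Cover: n_intervals equal-width intervals over the lens range, each
    widened by `overlap` times its width on both sides.
    Clustering: connected components within each interval's preimage.
    Two supernodes are linked when they share at least one original node.
    """
    lens = np.asarray(lens, dtype=float)
    lo, hi = lens.min(), lens.max()
    width = (hi - lo) / n_intervals
    pad = overlap * width
    supernodes = []
    for i in range(n_intervals):
        a = lo + i * width - pad
        b = lo + (i + 1) * width + pad
        members = np.flatnonzero((lens >= a) & (lens <= b))
        supernodes.extend(connected_components(adj, members))
    edges = {
        (i, j)
        for i in range(len(supernodes))
        for j in range(i + 1, len(supernodes))
        if supernodes[i] & supernodes[j]
    }
    return supernodes, edges
```

With a feature matrix `X`, one might call `adj = correlation_network(X)` and then `mapper_pool(adj, adj.sum(axis=0))`, using node degree as the lens. The resulting supernodes and their overlaps form a coarser graph; how the paper's models consume such a graph is not described in this record.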