Keywords: graph representation learning, digital pathology
Abstract: Self-supervised vision models have achieved notable success in digital pathology. However, their domain-agnostic transformer architectures are not designed to inherently account for fundamental biological elements of histopathology images, namely cells and their complex interactions. In this work, we hypothesize that biologically informed modeling of tissues as cell graphs enables more efficient representation learning. Thus, we introduce GrapHist, a novel graph-based self-supervised framework for histopathology, which learns generalizable and structurally informed embeddings that support diverse downstream tasks. GrapHist integrates masked autoencoders with heterophilic graph neural networks that are explicitly designed to capture the heterogeneity of tumor microenvironments. We pre-train GrapHist on a large collection of 11 million cell graphs derived from breast tissues and evaluate its transferability across in- and out-of-domain benchmarks spanning thorax, colorectal, and skin cancers. Our results show that GrapHist achieves competitive performance compared to its vision-based counterparts while requiring four times fewer parameters. It also drastically outperforms fully supervised graph models on cancer subtyping tasks. Finally, to foster further research, we release the eight digital pathology graph datasets used in our study, establishing the first large-scale benchmark in this field.
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 43