Rethinking GNNs and Missing Features: Challenges, Evaluation and a Robust Solution

ICLR 2026 Conference Submission 17475 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Missing Features, Graph Neural Networks, Missing Data Mechanisms
Abstract: Handling missing node features is a key challenge for deploying Graph Neural Networks (GNNs) in real-world domains such as healthcare and sensor networks. Existing studies mostly address relatively benign scenarios, namely benchmark datasets with (a) high-dimensional but sparse node features and (b) incomplete data generated under Missing Completely At Random (MCAR) mechanisms. For (a), we theoretically prove that high sparsity substantially limits the information loss caused by missingness, making all models appear robust and preventing a meaningful comparison of their performance. To overcome this limitation, we introduce one synthetic and three real-world datasets with dense, semantically meaningful features. For (b), we move beyond MCAR and design evaluation protocols with more realistic missingness mechanisms. Moreover, we provide a theoretical background to state explicit assumptions on the missingness process and analyze their implications for different methods. Building on this analysis, we propose GNNmim, a simple yet effective approach for node classification with incomplete feature data. Experiments show that GNNmim consistently outperforms specialized architectures across diverse datasets and missingness regimes.
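The sparsity argument in point (a) can be illustrated with a toy experiment. The sketch below is not the paper's proof or evaluation protocol: it assumes binary features, an MCAR mask, and zero imputation, and it uses the fraction of entries whose value changes after imputation as a crude proxy for information loss. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import random


def mcar_mask(n_rows, n_cols, rate, rng):
    """Return a mask where True marks a missing entry.

    Each entry is dropped independently with probability `rate`,
    regardless of its value (the MCAR assumption).
    """
    return [[rng.random() < rate for _ in range(n_cols)] for _ in range(n_rows)]


def binary_features(n_rows, n_cols, density, rng):
    """Toy binary features: each entry is 1 with probability `density`."""
    return [[1 if rng.random() < density else 0 for _ in range(n_cols)]
            for _ in range(n_rows)]


def fraction_changed_by_zero_imputation(features, mask):
    """Fraction of all entries whose value differs once masked entries are set to 0."""
    total = 0
    changed = 0
    for row, mask_row in zip(features, mask):
        for value, is_missing in zip(row, mask_row):
            total += 1
            imputed = 0 if is_missing else value
            if imputed != value:
                changed += 1
    return changed / total


rng = random.Random(0)  # fixed seed so the run is repeatable
mask = mcar_mask(200, 100, rate=0.5, rng=rng)

# Assumed densities: 1% non-zeros (bag-of-words-like) vs. 90% non-zeros (dense).
sparse = binary_features(200, 100, density=0.01, rng=rng)
dense = binary_features(200, 100, density=0.9, rng=rng)

frac_sparse = fraction_changed_by_zero_imputation(sparse, mask)
frac_dense = fraction_changed_by_zero_imputation(dense, mask)
print(f"sparse: {frac_sparse:.4f}, dense: {frac_dense:.4f}")
```

Under these assumptions the expected changed fraction is roughly the missingness rate times the feature density, so the same 50% MCAR mask alters far fewer entries of the sparse matrix than of the dense one.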
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 17475