On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features
Keywords: graph neural networks, missing features, graphs, geometric deep learning
TL;DR: A diffusion-based method to deal with partially missing node features when learning on graphs
Abstract: While Graph Neural Networks (GNNs) have recently become the de facto standard for modeling relational data, they impose a strong assumption on the availability of the node or edge features of the graph. In many real-world applications, however, features are only partially available; for example, in social networks, age and gender are available only for a small subset of users. We present a general approach for handling missing features in graph machine learning applications that is based on minimizing the Dirichlet energy, which leads to a diffusion-type differential equation on the graph. Discretizing this equation produces a simple, fast and scalable algorithm, which we call Feature Propagation. We experimentally show that the proposed approach outperforms previous methods on seven common node-classification benchmarks and can withstand surprisingly high rates of missing features: on average, we observe a relative accuracy drop of only around 4% when 99% of the features are missing. Moreover, it takes only 10 seconds to run on a graph with ~2.5M nodes and ~23M edges on a single GPU. The code is available at https://github.com/twitter-research/feature-propagation.
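The abstract describes the algorithm only at a high level, so the following is a minimal sketch of one plausible discretization, not the authors' implementation. It assumes an undirected graph, a symmetrically normalized adjacency matrix D^{-1/2} A D^{-1/2} as the diffusion operator, zero initialization of missing entries, and a fixed iteration count; the function name `feature_propagation` and all of its parameters are hypothetical. Each step diffuses features along the edges and then resets the observed entries to their known values, which plays the role of the boundary condition in the diffusion equation.

```python
import math


def feature_propagation(num_nodes, edges, features, known, num_iters=40):
    """Fill in missing node features by repeated diffusion (illustrative sketch).

    num_nodes: number of nodes, labeled 0..num_nodes-1
    edges:     iterable of undirected (u, v) pairs
    features:  features[i][c] is channel c of node i (ignored where unknown)
    known:     known[i][c] is True where features[i][c] was observed
    num_iters: number of diffusion steps (assumed default, not from the abstract)
    """
    # Build symmetric neighbor sets, ignoring self-loops.
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    degree = [len(n) for n in neighbors]
    num_channels = len(features[0])

    # Assumption: unknown entries start at zero.
    x = [[features[i][c] if known[i][c] else 0.0 for c in range(num_channels)]
         for i in range(num_nodes)]

    for _ in range(num_iters):
        new_x = [[0.0] * num_channels for _ in range(num_nodes)]
        # Diffusion step: multiply by D^{-1/2} A D^{-1/2}.
        for i in range(num_nodes):
            for j in neighbors[i]:
                weight = 1.0 / math.sqrt(degree[i] * degree[j])
                for c in range(num_channels):
                    new_x[i][c] += weight * x[j][c]
        # Boundary condition: observed entries keep their original values.
        for i in range(num_nodes):
            for c in range(num_channels):
                if known[i][c]:
                    new_x[i][c] = features[i][c]
        x = new_x
    return x


# Usage: a 4-cycle where nodes 0 and 2 are observed and nodes 1 and 3 are missing.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
cycle_features = [[3.0], [0.0], [3.0], [0.0]]
cycle_known = [[True], [False], [True], [False]]
filled = feature_propagation(4, cycle_edges, cycle_features, cycle_known)
print(filled)
```

The reconstructed features would then be passed to any downstream GNN in place of the incomplete feature matrix. A dense or sparse matrix library would replace the nested loops in practice; plain Python is used here only to keep the sketch dependency-free.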
PDF File: pdf
Supplementary Materials: zip
Type Of Submission: Full paper proceedings track submission (max 9 main pages).
Software: https://github.com/twitter-research/feature-propagation
Video: https://www.youtube.com/watch?v=xe5A-xQTBdM
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2111.12128/code)