Abstract: Graph Neural Networks (GNNs) have achieved great success in recent years thanks to their ability to extract effective representations from both node features and graph structure. Most GNNs, however, focus only on graphs with homogeneous features that correspond to a single feature field. On tabular features, which are heterogeneous and span multiple feature fields, GNNs often underperform classical machine learning methods such as boosted trees. In this work, we offer a new perspective on why GNNs struggle on graphs with tabular features, supported by both an empirical study and theoretical analysis. The assumption underlying GNNs, that connected nodes exhibit similar patterns, rarely holds for tabular features, because different feature fields already exhibit different patterns. Propagation on such a mismatched graph makes the propagated features overcorrelated, which reduces feature diversity and increases information redundancy. We therefore propose the Field-aware Decorrelation Neural Network for graphs with tabular features (GraphFADE), a novel framework that directly addresses the overcorrelation problem on such graphs. We first hierarchically partition the dataset into subsets with minimal correlation and then, guided by these decorrelation clustering results, assemble the best-matched graph for each feature dimension to propagate on. Experiments show that our method achieves superior performance on multiple graphs with tabular features, demonstrating the effectiveness of our model.
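The overcorrelation effect described above can be illustrated with a small NumPy sketch. This is not the GraphFADE method or the paper's experimental setup. It assumes a synthetic graph (a ring with random chords), independent Gaussian features, GCN-style symmetric normalization with self-loops, and mean absolute Pearson correlation between feature dimensions as the correlation measure. The helper names `mean_abs_offdiag_corr` and `sym_norm_adj` are hypothetical.

```python
import numpy as np


def mean_abs_offdiag_corr(x):
    """Mean absolute Pearson correlation between distinct feature columns."""
    corr = np.corrcoef(x, rowvar=False)
    d = corr.shape[0]
    off_diag = corr[~np.eye(d, dtype=bool)]
    return float(np.abs(off_diag).mean())


def sym_norm_adj(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


rng = np.random.default_rng(0)
n_nodes, n_feats = 200, 8  # illustrative sizes, not taken from the paper

# Synthetic connected graph: a ring (guarantees connectivity) plus random chords.
adj = np.zeros((n_nodes, n_nodes))
idx = np.arange(n_nodes)
adj[idx, (idx + 1) % n_nodes] = 1.0
chords = np.triu(rng.random((n_nodes, n_nodes)) < 0.03, k=1)
adj[chords] = 1.0
adj = np.maximum(adj, adj.T)  # make the adjacency symmetric

# Independent features: the feature dimensions start out nearly uncorrelated.
x = rng.standard_normal((n_nodes, n_feats))
prop = sym_norm_adj(adj)

before = mean_abs_offdiag_corr(x)

# Propagate every feature dimension over the same graph, with no learned weights.
h = x.copy()
for _ in range(50):
    h = prop @ h

after = mean_abs_offdiag_corr(h)

print(f"mean |corr| before propagation: {before:.3f}")
print(f"mean |corr| after 50 steps:     {after:.3f}")
```

In this toy setting, repeated propagation pushes every feature column toward the same dominant eigenvector of the normalized adjacency, so the columns become nearly collinear. Sharing one graph across all feature dimensions is what drives the effect here, which is the situation the per-dimension graphs in the abstract are meant to avoid.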