Abstract: Graph Neural Networks (GNNs) have shown excellent performance on graph-related tasks and have attracted widespread attention. However, most existing work on GNNs focuses on proposing novel GNN models or modifying existing ones to improve performance on various graph-related tasks, and seldom considers possible problems in the graph data used as model input. This paper shows that low-degree nodes, which account for the majority of nodes in real-world graphs, naturally suffer from a message passing insufficiency problem: they receive too little information from other nodes when generating node embeddings, which degrades GNN performance. To address this problem, we propose a simple but practical method, Optimize Graph Then Training (OGT), which adds edges between low-degree nodes and nodes with the same predicted label, based on the GNN's prediction results and the inherent information in the graph. OGT aims to improve the performance of GNNs on semi-supervised node classification tasks by augmenting the input data. More importantly, OGT can be regarded as a data preprocessing technique and can be combined naturally with baseline GNN models (e.g., GCN, GAT, GraphSAGE, and SGC) to improve their performance without any other modifications. Extensive experiments on three benchmark citation datasets with five typical GNN models verify that OGT consistently improves the performance of various GNNs, yielding average accuracy improvements of 1.9% (Cora), 1.0% (Citeseer), and 1.3% (Pubmed) on the node classification task.
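The abstract does not spell out OGT's exact procedure, but the core idea — connecting low-degree nodes to other nodes that a trained GNN assigns the same predicted label — can be sketched as a graph-preprocessing step. The sketch below is an assumption-laden illustration, not the paper's actual algorithm: the function name `ogt_augment`, the `degree_threshold` for deciding which nodes are low-degree, the per-node cap `k_new_edges`, and the heuristic of preferring high-degree same-label neighbors are all illustrative choices.

```python
# Hypothetical sketch of the OGT idea: before (re)training, add edges
# between each low-degree node and nodes sharing its predicted label.
# All parameter names and heuristics here are assumptions for illustration.
from collections import defaultdict

def ogt_augment(edges, predicted_labels, degree_threshold=2, k_new_edges=2):
    """Return the edge list plus new edges for low-degree nodes.

    edges            -- list of (u, v) undirected edges
    predicted_labels -- dict node -> label predicted by a pre-trained GNN
    degree_threshold -- nodes with degree <= this count as low-degree (assumed)
    k_new_edges      -- max new edges added per low-degree node (assumed)
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Group nodes by their predicted class.
    by_label = defaultdict(list)
    for node, label in predicted_labels.items():
        by_label[label].append(node)

    existing = {frozenset(e) for e in edges}
    new_edges = []
    for node, label in predicted_labels.items():
        if degree[node] > degree_threshold:
            continue  # only augment low-degree nodes
        added = 0
        # Prefer well-connected same-label nodes as new neighbors (assumption).
        for cand in sorted(by_label[label], key=lambda n: -degree[n]):
            if added >= k_new_edges:
                break
            if cand != node and frozenset((node, cand)) not in existing:
                new_edges.append((node, cand))
                existing.add(frozenset((node, cand)))
                added += 1
    return edges + new_edges
```

A GNN would then be trained on the augmented edge list, so low-degree nodes aggregate messages from more (predicted) same-class neighbors.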