PROMPTGNN-SIM: DEEP FUSION AND ALIGNMENT OF GNN AND LLMS FOR TEXT-ATTRIBUTED GRAPH LEARNING

16 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Text-Attributed Graphs, Graph Neural Networks, Large Language Models, Multi-modal Learning, GNN-LLM
TL;DR: We propose a novel framework that overcomes the limitations of shallow GNN-LLM fusion by dynamically guiding LLMs with structural information from GNNs, significantly improving performance on text-attributed graphs.
Abstract: Text-Attributed Graphs (TAGs), which integrate rich textual semantics with graph structural information, play a critical role in graph learning tasks. However, current fusion approaches suffer from a fundamental limitation: they treat the textual and structural modalities as separate inputs in a shallow, unidirectional pipeline. This one-way information flow prevents a deep, interactive exchange between modalities, leading to suboptimal performance, particularly in challenging scenarios with sparse connectivity and when generalizing across different graphs. To overcome these limitations, we introduce PromptGNN-sim, a novel bi-directional structure-semantic fusion framework that enables deep, symbiotic collaboration between GNNs and LLMs. At its core, PromptGNN-sim leverages a Graph Attention Network (GAT) to perform semantically aware neighborhood selection, combining structural attention with textual similarity. This GNN-derived structural context is then used to dynamically generate rich, structure-aware prompts for an LLM, which explicitly include the target node's textual summary, predicted label, and representative keywords from semantically similar neighbors. Unlike traditional methods, our framework incorporates bi-directional cross-modal contrastive learning and cross-attention mechanisms during training to jointly optimize both the GNN and LLM components for enhanced performance and robustness. We conduct comprehensive experiments on six public datasets, including Cora, Pubmed, and WikiCS, evaluating both task performance and robustness under cross-task transfer, cross-dataset generalization, and sparse perturbations. Results show that PromptGNN-sim significantly outperforms classical GNNs, LLMs, and recent state-of-the-art GNN–LLM fusion methods in terms of accuracy, generalization, and robustness.
This work not only introduces an effective framework for deep GNN–LLM collaboration but also lays a solid foundation for future research on truly interactive multi-modal graph learning.
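To illustrate the core idea, the following is a minimal, hypothetical sketch of the two steps the abstract describes: blending GNN attention scores with textual cosine similarity to select neighbors, then assembling a structure-aware prompt. All function names, the blending weight `alpha`, and the prompt template are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def select_neighbors(att_scores, text_emb, target, neighbors, alpha=0.5, k=3):
    """Illustrative semantically-aware neighborhood selection:
    blend a GAT attention score with textual cosine similarity
    and keep the top-k highest-scoring neighbors.

    att_scores : dict mapping neighbor id -> attention weight (assumed given)
    text_emb   : (num_nodes, dim) array of node text embeddings
    """
    t = text_emb[target]
    scores = []
    for n in neighbors:
        v = text_emb[n]
        cos = float(v @ t / (np.linalg.norm(v) * np.linalg.norm(t) + 1e-9))
        # Convex combination of structural attention and textual similarity
        scores.append(alpha * att_scores[n] + (1 - alpha) * cos)
    order = np.argsort(scores)[::-1][:k]
    return [neighbors[i] for i in order]

def build_prompt(summary, pred_label, keywords):
    # Structure-aware prompt carrying the three ingredients the abstract
    # names: node summary, GNN-predicted label, and neighbor keywords.
    return (f"Node summary: {summary}\n"
            f"GNN-predicted label: {pred_label}\n"
            f"Keywords from similar neighbors: {', '.join(keywords)}\n"
            "Refine the label prediction given this structural context.")
```

In this sketch, a neighbor that is weak structurally but textually close to the target can still be selected, which is the intended effect of mixing the two signals rather than using attention alone.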
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 7390