Curriculum GNN-LLM Alignment for Text-Attributed Graphs

ICLR 2025 Conference Submission 13351 Authors

28 Sept 2024 (modified: 24 Nov 2024), ICLR 2025 Conference Submission, CC BY 4.0
Keywords: Graph neural networks, Large language models, Text-attributed graphs, Curriculum learning
TL;DR: We propose a Curriculum GNN-LLM Alignment method for TAGs to strategically balance the learning difficulties of textual and structural information on a node-by-node basis, enhancing the alignment between GNNs and LLMs.
Abstract: Aligning Graph Neural Networks (GNNs) and Large Language Models (LLMs) makes it possible to leverage both textual and structural knowledge when learning on Text-attributed Graphs (TAGs), and has attracted increasing attention in the research community. Most existing literature assumes a uniform level of learning difficulty across the texts and structures in TAGs. However, we discover the $\textit{text-structure imbalance}$ problem in real-world TAGs, $\textit{i.e.}$, nodes exhibit varying levels of difficulty when learning different textual and structural information. Existing works that ignore these differences may yield under-optimized GNNs and LLMs that over-rely on either simplistic text or simplistic structure, and thus fail at node classification tasks that require simultaneously learning complex textual and structural information for nodes in TAGs. To address this problem, we propose a novel Curriculum GNN-LLM Alignment ($\textbf{CurGL}$) method, which strategically balances the learning difficulties of textual and structural information on a node-by-node basis to enhance the alignment between GNNs and LLMs. Specifically, we first propose a text-structure difficulty measurer that estimates the learning difficulty of both text and structure in a node-wise manner. Then, we propose a class-based node selection strategy that balances the training process by gradually scheduling more nodes. Finally, we propose curriculum co-play alignment, which iteratively promotes useful information from GNNs and LLMs to progressively enhance both components with balanced textual and structural information. Extensive experiments on real-world datasets demonstrate that the proposed $\textbf{CurGL}$ method outperforms state-of-the-art GraphLLM, curriculum learning, and GNN baselines. To the best of our knowledge, this is the first study of curriculum alignment on TAGs.
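Illustration: the abstract does not specify how the class-based node selection is computed, so the following is only a minimal sketch of one common way to realize the idea of gradually scheduling more nodes while keeping classes balanced. It assumes a per-node difficulty score is already available (how CurGL actually measures text and structure difficulty is not stated here), and the names `pacing_fraction`, `select_nodes_per_class`, and the linear pacing schedule are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from collections import defaultdict


def pacing_fraction(epoch, total_epochs, start=0.25):
    """Hypothetical linear pacing: fraction of training nodes admitted at `epoch`.

    Starts at `start` and grows linearly to 1.0 by the final epoch.
    """
    if total_epochs <= 1:
        return 1.0
    progress = epoch / (total_epochs - 1)
    return min(1.0, start + (1.0 - start) * progress)


def select_nodes_per_class(difficulty, labels, fraction):
    """Return sorted indices of the easiest `fraction` of nodes within each class.

    `difficulty[i]` is an assumed precomputed score for node i (lower = easier).
    Selecting per class keeps every class represented at every curriculum stage.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    selected = []
    for members in by_class.values():
        # Easiest nodes first within this class.
        ranked = sorted(members, key=lambda i: difficulty[i])
        keep = max(1, math.ceil(fraction * len(ranked)))
        selected.extend(ranked[:keep])
    return sorted(selected)


if __name__ == "__main__":
    # Toy data: 8 nodes, 2 classes, made-up difficulty scores.
    difficulty = [0.9, 0.1, 0.5, 0.3, 0.8, 0.2, 0.4, 0.7]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    for epoch in range(4):
        frac = pacing_fraction(epoch, total_epochs=4, start=0.25)
        print(epoch, select_nodes_per_class(difficulty, labels, frac))
```

In a training loop, the selected indices would form the training mask for the current epoch, so that both the GNN and the LLM first see the nodes scored as easy and only later the harder ones. Ranking within each class rather than globally is one way to avoid an early curriculum that contains only the classes whose nodes happen to score as easy.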
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 13351