Graph as New Language: LLM-Based Graph Learning with Node-to-Center Path Sequences as Training Corpus

ACL ARR 2025 May Submission298 Authors

10 May 2025 (modified: 03 Jul 2025) · CC BY 4.0
Abstract: Graph learning is widely encountered in real-world applications. Existing approaches typically combine graph neural networks with NLP methods, and more recently with large language models (LLMs), to encode node texts. However, this two-stage paradigm suffers from suboptimal alignment between textual and structural features. Since LLMs are probabilistic models that excel at next-word prediction and are not inherently designed for graphs, we propose a new perspective that treats a graph as a new language, enabling language models to predict node sequences learned from the graph structure. Unlike natural language, which offers coherent and abundant corpora, graphs do not inherently provide structured, meaningful node orders, making it challenging to construct a corpus of high-quality node sequences. To address this problem, we design PathGLM (Path-based Graph Language Model), which first builds a community-centric corpus that constrains path selection to the scope of each community. Next, we extract structurally meaningful node-to-center paths and feed them into LLMs to learn the grammar of the graph language; these paths also serve as prefixes during fine-tuning. Experimental results show that PathGLM improves semantic–structure integration and achieves state-of-the-art performance.
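The corpus-construction idea described in the abstract — community-restricted node-to-center paths serialized into node sequences — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy graph, the community assignment, and the choice of center nodes are all assumptions (in practice communities would come from a detection algorithm such as Louvain, and centers from some centrality criterion).

```python
from collections import deque

# Toy text-attributed graph as adjacency lists (illustrative, not from the paper).
graph = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E"],
}
# Assumed community assignment and one representative "center" node per community.
communities = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
centers = {0: "C", 1: "E"}

def node_to_center_path(graph, communities, centers, start):
    """BFS shortest path from `start` to its community's center,
    restricted to nodes within the same community."""
    comm = communities[start]
    target = centers[comm]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph[node]:
            if communities[nxt] == comm and nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # center unreachable within the community

# One node sequence per node: each line could serve as a training example
# (or fine-tuning prefix) in a path-based corpus.
corpus = [" ".join(node_to_center_path(graph, communities, centers, n)) for n in graph]
```

Restricting BFS to a single community keeps each sequence short and locally coherent, which is the stated motivation for the community-centric corpus.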
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: graph learning, text attributed graphs, large language models, graph language, structure-semantic integration
Contribution Types: Model analysis & interpretability
Languages Studied: English
Keywords: graph learning, text attributed graphs, large language models, graph language, structure-semantic integration
Submission Number: 298