Can LLMs Effectively Leverage Graph Structural Information: When and Why

Published: 28 Oct 2023, Last Modified: 21 Dec 2023
NeurIPS 2023 GLFrontiers Workshop Poster
Keywords: Large Language Model, Graph, Multimodality, Data Leakage, Homophily
TL;DR: We investigate when and why LLMs can effectively leverage the structural information stored in graph-structured data to improve predictive performance.
Abstract: This paper studies Large Language Models (LLMs) augmented with structured data, particularly graphs, a crucial data modality that remains underexplored in the LLM literature. We aim to understand when and why incorporating the structural information inherent in graph data can improve the prediction performance of LLMs on node classification tasks with textual features. To address the "when" question, we examine a variety of prompting methods for encoding structural information, in settings where textual node features are either rich or scarce. For the "why" question, we probe two potential contributing factors to LLM performance: data leakage and homophily. Our exploration of these questions reveals that (i) LLMs can benefit from structural information, especially when textual node features are scarce; (ii) there is no substantial evidence that the performance of LLMs is significantly attributable to data leakage; and (iii) the performance of LLMs on a target node is strongly positively related to the local homophily ratio of the node.
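The local homophily ratio mentioned in finding (iii) is commonly defined as the fraction of a node's neighbors that share its label. A minimal sketch of that computation (illustrative only; the function name and graph representation are our own, not the paper's code):

```python
def local_homophily_ratio(node, adjacency, labels):
    """Fraction of `node`'s neighbors with the same label as `node`."""
    neighbors = adjacency[node]
    if not neighbors:
        return 0.0  # convention chosen here for isolated nodes
    same = sum(1 for n in neighbors if labels[n] == labels[node])
    return same / len(neighbors)

# Toy graph: node 0 has neighbors 1, 2, 3; two of the three share its label.
adjacency = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
labels = {0: "A", 1: "A", 2: "A", 3: "B"}
print(local_homophily_ratio(0, adjacency, labels))  # → 0.666...
```

A ratio near 1 means the node's neighborhood is label-consistent, which the paper links to stronger LLM prediction performance on that node.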
Submission Number: 66