Large Language Model Guided Graph Clustering

Puja Trivedi; Nurendra Choudhary; Edward W Huang; Vassilis N. Ioannidis; Karthik Subbian; Danai Koutra

Published: 16 Nov 2024, Last Modified: 26 Nov 2024, Venue: LoG 2024 Poster, License: CC BY 4.0

Keywords: LLM, Text Attributed Graphs, Active Learning, Graph Clustering, LLMs+Graphs, Node Clustering

TL;DR: We use LLMs to obtain feedback for refining graph clustering solutions on text-attributed graphs.

Abstract: Graph clustering on text-attributed graphs (TAGs), i.e., graphs that include natural language text as additional node information, is typically performed using graph neural networks (GNNs), which forgo the raw text in favor of embeddings. While GNN methods ensure scalability and effectively leverage graph topology, text attributes contain rich information that can be leveraged using large language models (LLMs). However, many real-world applications have limited hardware resources or LLM API call budgets that prevent naive use of LLMs. To reconcile these constraints when performing clustering on TAGs, we propose an active learning framework that performs graph clustering using LLM refinement (GCLR) by selectively prompting an imperfect LLM oracle for feedback and, subsequently, fine-tuning the GNN-based clustering solution to incorporate the feedback. GCLR uses different prompting strategies to improve the LLM's reliability as an oracle and uses noise-controlling fine-tuning to handle this imperfect but useful feedback. Extensive experiments demonstrate that GCLR can significantly improve clustering performance over state-of-the-art GNN methods.
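
The budgeted querying step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how nodes are selected, so the entropy-based selection rule below is an assumption, and the names `select_queries`, `collect_feedback`, and the `oracle` callable are hypothetical. The oracle stands in for an LLM prompt and may return noisy labels.

```python
import math
from typing import Callable, Dict, List, Sequence


def assignment_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy of one node's soft cluster assignment."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(soft_assignments: List[Sequence[float]], budget: int) -> List[int]:
    """Pick the `budget` most uncertain nodes (assumed rule: highest entropy)."""
    ranked = sorted(
        range(len(soft_assignments)),
        key=lambda i: assignment_entropy(soft_assignments[i]),
        reverse=True,
    )
    return ranked[:budget]


def collect_feedback(
    soft_assignments: List[Sequence[float]],
    texts: List[str],
    oracle: Callable[[str], int],
    budget: int,
) -> Dict[int, int]:
    """Query the (hypothetical) oracle once per selected node.

    Returns a mapping from node index to the cluster id the oracle suggested.
    The number of oracle calls never exceeds `budget`.
    """
    return {i: oracle(texts[i]) for i in select_queries(soft_assignments, budget)}


if __name__ == "__main__":
    # Toy soft assignments over two clusters, as a GNN clustering head might emit.
    soft = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
    texts = ["graph neural nets", "protein folding", "graph clustering"]

    # Stand-in for an LLM call; a real oracle would be imperfect.
    def toy_oracle(text: str) -> int:
        return 0 if "graph" in text else 1

    print(collect_feedback(soft, texts, toy_oracle, budget=2))
```

In a full pipeline, the returned feedback would then drive the fine-tuning of the GNN-based clustering solution. How that fine-tuning controls for oracle noise is not specified in the abstract and is left out of this sketch.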

Supplementary Materials: zip

Submission Type: Extended abstract (max 4 main pages).

Poster: png

Poster Preview: png

Submission Number: 145
