Keywords: LLM, Text Attributed Graphs, Active Learning, Graph Clustering, LLMs+Graphs, Node Clustering
TL;DR: We use LLMs to obtain feedback for refining graph clustering solutions on text-attributed graphs.
Abstract: Graph clustering on text-attributed graphs (TAGs), i.e., graphs that include natural language text as additional node information, is typically performed using graph neural networks (GNNs), which forgo the raw text in favor of embeddings. While GNN methods ensure scalability and effectively leverage graph topology, text attributes contain rich information that can be leveraged using large language models (LLMs). However, many real-world applications have limited hardware resources or LLM API call budgets that prevent naive use of LLMs. To reconcile these constraints when performing clustering on TAGs, we propose an active learning framework, graph clustering using LLM refinement (GCLR), that selectively prompts an imperfect LLM oracle for feedback and subsequently fine-tunes the GNN-based clustering solution to incorporate that feedback. GCLR uses different prompting strategies to improve the LLM's reliability as an oracle and uses noise-controlling fine-tuning to handle this imperfect but useful feedback. Extensive experiments demonstrate that GCLR can significantly improve clustering performance over state-of-the-art GNN methods.
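The budgeted query step described in the abstract can be illustrated with a minimal sketch. It is not the GCLR implementation: the abstract does not state how nodes are selected or how oracle noise is reduced, so the margin-based uncertainty criterion, the majority vote over repeated queries, and the names `margin`, `select_queries`, `collect_feedback`, and `oracle` are all assumptions made for illustration. The `oracle` callable is a hypothetical stand-in for an LLM prompt, and the fine-tuning step is omitted.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two largest cluster probabilities (small = uncertain).

    Assumes at least two clusters.
    """
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def select_queries(soft_assign: List[Sequence[float]], budget: int) -> List[int]:
    """Return the `budget` node indices with the smallest margin.

    Assumed selection rule (uncertainty sampling); the abstract only says
    that the oracle is prompted selectively.
    """
    order = sorted(range(len(soft_assign)), key=lambda i: margin(soft_assign[i]))
    return order[:budget]


def collect_feedback(
    nodes: List[int],
    oracle: Callable[[int], int],  # hypothetical stand-in for an LLM call
    votes: int = 3,
) -> Dict[int, int]:
    """Query the noisy oracle `votes` times per node and keep the majority label.

    Total oracle calls = len(nodes) * votes, which must fit the API budget.
    """
    feedback = {}
    for n in nodes:
        answers = Counter(oracle(n) for _ in range(votes))
        feedback[n] = answers.most_common(1)[0][0]
    return feedback


# Toy usage: four nodes, two clusters, a deterministic dummy oracle.
soft_assign = [[0.9, 0.1], [0.55, 0.45], [0.5, 0.5], [0.2, 0.8]]
chosen = select_queries(soft_assign, budget=2)
feedback = collect_feedback(chosen, oracle=lambda n: n % 2)
```

In a full pipeline under these assumptions, the returned labels would serve as noisy supervision when fine-tuning the GNN-based clustering model, and the product of `budget` and `votes` would be set from the available LLM call budget.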
Supplementary Materials: zip
Submission Type: Extended abstract (max 4 main pages).
Poster: png
Poster Preview: png
Submission Number: 145