Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation

ACL ARR 2025 May Submission1350 Authors

17 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · License: CC BY 4.0
Abstract: In recent years, large language models (LLMs) have emerged as promising candidates for graph tasks. Many studies describe graphs in natural language and apply LLMs for reasoning, yet most focus narrowly on performance benchmarks without fully comparing LLMs against graph learning models or exploring their broader potential. In this work, we present a comprehensive study of LLMs on graph tasks, evaluating both off-the-shelf and instruction-tuned models across a variety of scenarios. Beyond accuracy, we analyze their computational overhead and assess their performance under few-shot and zero-shot settings, domain transfer, structural understanding, and robustness. Our findings show that LLMs, particularly instruction-tuned ones, greatly outperform traditional graph models in few-shot settings, exhibit strong domain transferability, and demonstrate excellent generalization and robustness. Our study highlights the broader capabilities of LLMs in graph learning and provides a foundation for future research. Code and datasets are available.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, graph-based methods, transfer learning / domain adaptation, few-shot learning, continual learning, evaluation
Languages Studied: English
Keywords: LLM for Graph, Benchmarking
Submission Number: 1350