Large Language Models as Topological Thinkers: A Benchmark on Graph Persistent Homology

ICLR 2026 Conference Submission 16584 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Persistent Homology, Large Language Models, Graph Reasoning, Topological Data Analysis
Abstract: Persistent homology offers a principled way to capture multi-scale topological structures in graphs, yet it remains unclear whether large language models (LLMs) can understand and reason about such higher-order topological concepts. To address this gap, we introduce LLM4PH, the first benchmark designed to evaluate the ability of LLMs to comprehend and apply persistent homology on graphs. Our benchmark decomposes the persistent homology pipeline into four progressively challenging task levels, ranging from simplicial structure understanding to real-world graph inference. It includes 9 sub-tasks spanning 3 synthetic graph sizes and 3 real-world graph datasets, each annotated with topological features such as connected components, simplices, filtrations, and persistence diagrams. We systematically assess LLMs' abilities to recognize topological features, reason over filtrations, design filtration strategies, and apply persistent homology to classification. Beyond task-level evaluation, we perform cross-task ablations on prompt encoding and transfer, explore the effects of post-training, and construct a compositional PH pipeline to assess end-to-end performance. Our results provide the first in-depth view of how well LLMs bridge discrete graph structures with continuous topological abstraction, and offer insights into their potential for structure-aware scientific reasoning.
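For orientation, the sketch below illustrates the kind of computation that underlies the pipeline the abstract describes: 0-dimensional persistent homology (birth/death pairs of connected components) of a graph under a vertex sublevel-set filtration, computed with union-find and the elder rule. The filtration convention (vertices appear at their filtration value, edges at the max of their endpoints) and the function name `h0_persistence` are illustrative assumptions, not details taken from the submission.

```python
# Minimal sketch: 0-dimensional persistent homology of a graph under a
# vertex sublevel-set filtration. The conventions here (vertices enter
# at their filtration value, edges at the max of their endpoints, elder
# rule on merges) are standard TDA assumptions, not the paper's setup.

def h0_persistence(vertices, edges, f):
    """Return (birth, death) pairs for connected components (H0).

    vertices : iterable of hashable vertex ids
    edges    : iterable of (u, v) pairs
    f        : dict mapping each vertex to its filtration value
    """
    # An edge enters the filtration once both of its endpoints exist.
    edge_val = {e: max(f[e[0]], f[e[1]]) for e in edges}

    parent = {v: v for v in vertices}   # union-find forest
    birth = dict(f)                     # birth time of each root's component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    # Sweep edges in filtration order; each merge of two components
    # kills the younger one (elder rule).
    for e in sorted(edges, key=edge_val.get):
        ru, rv = find(e[0]), find(e[1])
        if ru == rv:
            continue  # edge closes a cycle; irrelevant for H0
        if birth[ru] > birth[rv]:
            ru, rv = rv, ru  # ensure rv is the younger component
        pairs.append((birth[rv], edge_val[e]))
        parent[rv] = ru
    # Components that never merge persist to infinity.
    pairs.extend((birth[r], float("inf")) for r in {find(v) for v in vertices})
    return pairs


# Example: the path 0-1-2 with arbitrary vertex filtration values.
print(h0_persistence([0, 1, 2], [(0, 1), (1, 2)],
                     {0: 0.0, 1: 1.0, 2: 0.5}))
# -> [(1.0, 1.0), (0.5, 1.0), (0.0, inf)]
```

The union-find sweep is the textbook O(m log m) route to H0 persistence; higher-dimensional features (cycles, voids) would require boundary-matrix reduction over a simplicial filtration, which this sketch deliberately omits.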
Primary Area: datasets and benchmarks
Submission Number: 16584