GraphGhosts: Tracing Reasoning Structures Behind Large Language Models

19 Sept 2025 (modified: 25 Sept 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: LLM reasoning; Interpretation; Graph
Abstract: Large Language Models (LLMs) exhibit remarkable reasoning and generalization abilities, yet the mechanisms underlying them remain poorly understood. Recent studies have introduced circuit-tracing methods to explain individual token predictions, but the question of why LLMs can perform complex reasoning tasks remains largely unexplored. Motivated by the inherent logical structure of reasoning tasks, we hypothesize that LLMs rely on latent graph-like structures, which we term GraphGhosts, to guide their reasoning processes. GraphGhosts manifest at three distinct levels: IntraSample Graphs, which capture token-to-token dependencies within individual reasoning instances; InterToken Graphs, which reveal dataset-level wiring patterns across tokens; and Semantic Graphs, which encode cross-domain conceptual understanding. To validate their significance, we conduct perturbation experiments on these graphs, showing that even small structural modifications can drastically alter a model's reasoning ability.
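The abstract does not specify how the graphs are constructed or perturbed. As a loose illustration only, under assumed details not stated in the paper (a thresholded attention matrix as the IntraSample Graph, and single-edge deletion as the "small structural modification"), the setup might look like:

```python
import numpy as np

def build_intra_sample_graph(attn, tau=0.2):
    """Threshold an attention matrix into a directed token-dependency graph.

    attn: (T, T) array where attn[i, j] is attention from token i to token j.
    Returns a boolean adjacency matrix. The threshold tau is an assumption,
    not a value from the paper.
    """
    return attn > tau

def perturb_edge(adj, src, dst):
    """Delete one dependency edge -- a minimal structural perturbation."""
    out = adj.copy()
    out[src, dst] = False
    return out

# Toy attention pattern over 4 tokens (each row sums to 1).
attn = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.40, 0.10, 0.10],
    [0.05, 0.50, 0.40, 0.05],
    [0.10, 0.10, 0.30, 0.50],
])

adj = build_intra_sample_graph(attn)
perturbed = perturb_edge(adj, 2, 1)  # cut the token-2 -> token-1 dependency
print(adj.sum(), perturbed.sum())    # the perturbed graph has one fewer edge
```

In the paper's experiments the downstream effect of such a perturbation would be measured on the model's reasoning accuracy; here the sketch only shows the graph-side operation.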
Primary Area: interpretability and explainable AI
Code Of Ethics: true
Submission Guidelines: true
Anonymous Url: true
No Acknowledgement Section: true
Submission Number: 20241