Sparse Spectral Signatures of Reasoning: Model-Agnostic Verification via Sentence-Level Graph Signals
Track: long paper (up to 10 pages)
Keywords: spectral graph theory, chain-of-thought verification, reasoning verification, graph signal processing, model-agnostic evaluation, LLM reasoning
TL;DR: Spectral metrics from sentence-level semantic graphs built solely from chain-of-thought text discriminate correct from incorrect LLM reasoning across domains and models—including closed-source ones—without model internals
Abstract: Recent work has shown that spectral properties of internal attention graphs can distinguish valid from invalid mathematical reasoning in LLMs. However, attention-based methods require access to model weights, excluding closed-source models and production deployments. We investigate whether analogous spectral signatures exist in external sentence-level semantic graphs constructed solely from chain-of-thought text. We construct cosine-similarity threshold graphs over sentence embeddings and compute spectral metrics from the graph Laplacian, requiring only black-box text output. Across 2,400 traces spanning three reasoning domains (mathematical, first-order logic, deductive) and four model architectures, including the closed-source Claude Sonnet 4, we find that spectral metrics reliably discriminate correct from incorrect reasoning, with 9 of 12 domain-model conditions significant at p<0.05 (AUC up to 0.77). Spectral features add up to +14.9% AUC over text-level baselines, with the largest gains when baselines are weakest, demonstrating that spectral analysis captures structural reasoning properties orthogonal to surface text quality.
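The graph construction described in the abstract can be illustrated with a minimal sketch. It assumes sentence embeddings are already available as a NumPy array; the function name `spectral_metrics`, the threshold value of 0.5, the use of the unnormalized Laplacian, and the three metrics returned are illustrative assumptions, not the settings or metrics used in the paper.

```python
import numpy as np


def spectral_metrics(embeddings, threshold=0.5):
    """Hypothetical helper: build a cosine-similarity threshold graph over
    sentence embeddings and return a few Laplacian eigenvalue summaries.

    `threshold` and the choice of metrics are assumptions for illustration.
    """
    X = np.asarray(embeddings, dtype=float)

    # Normalize rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.clip(norms, 1e-12, None)
    sim = X @ X.T

    # Connect two sentences when their similarity reaches the threshold.
    A = (sim >= threshold).astype(float)
    np.fill_diagonal(A, 0.0)  # no self-loops

    # Unnormalized graph Laplacian L = D - A (an assumed variant).
    L = np.diag(A.sum(axis=1)) - A

    # L is symmetric, so eigvalsh returns real eigenvalues in ascending order.
    eigvals = np.linalg.eigvalsh(L)

    return {
        # Second-smallest eigenvalue (algebraic connectivity).
        "fiedler_value": float(eigvals[1]) if len(eigvals) > 1 else 0.0,
        "largest_eigenvalue": float(eigvals[-1]),
        # Number of near-zero eigenvalues equals the number of components.
        "num_components": int(np.sum(eigvals < 1e-8)),
    }
```

In this sketch, a chain of thought whose sentences split into unrelated clusters yields a Fiedler value of zero and more than one connected component, while a tightly connected trace yields a larger Fiedler value. Only the text output is needed, since the embeddings can come from any external sentence encoder.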
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 74