Geometry of Reason: Probabilistic Spectral Verification for Mathematical Reasoning

Valentin NOËL

Published: 02 Mar 2026, Last Modified: 24 Mar 2026. Venue: ICLR 2026 Workshop VerifAI-2. License: CC BY 4.0

Track: long paper (up to 8 pages)

Keywords: Mathematical Reasoning, Formal Verification, Attention Topology, Platonic Validity, Automated Theorem Proving, Mechanistic Interpretability, High-Frequency Energy Ratio (HFER), Hallucination Detection, AI Safety, Reward Shaping.

TL;DR: A training-free method to verify AI math proofs by measuring whether the model's internal attention is "smooth" (valid) or "noisy" (invalid) using spectral graph analysis.

Abstract: Formal verification of mathematical reasoning in large language models (LLMs) often faces a binary bottleneck: proofs are frequently rejected by compilers due to timeouts or syntactic idiosyncrasies rather than logical failure. We propose a probabilistic, training-free alternative that provides "soft assurances" of reasoning integrity: spectral analysis of attention topology. By treating attention matrices as dynamic graphs, we extract four interpretable spectral diagnostics (Fiedler value, High-Frequency Energy Ratio (HFER), spectral entropy, and graph smoothness) that differentiate logically coherent trajectories from hallucinations without learned parameters. Across seven models from the Llama, Qwen, Phi, and Mistral families, our method yields effect sizes up to Cohen's $d = 3.30$ ($p < 10^{-116}$), achieving $85$--$96\%$ accuracy. Crucially, we discover that spectral analysis tracks "Platonic validity", identifying mathematically sound proofs that formal verifiers reject, which offers a robust signal for guiding search processes in automated theorem proving. Causal ablation studies confirm that this signature reflects the functional health of induction circuits, establishing a mechanistic basis for the method. We further demonstrate that attention architecture (e.g., Sliding Window Attention) deterministically shifts the discriminative signal, and that the method generalizes to informal chain-of-thought ($d = 0.78$). These findings position spectral topology as a principled framework for hybrid verification, with immediate applications as an expressive reward signal for reinforcement learning and real-time safety monitoring in agentic deployments.
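
The abstract names the four diagnostics without defining them, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes the attention matrix is symmetrized into an undirected weighted graph, uses the combinatorial Laplacian $L = D - W$, and borrows standard graph signal processing definitions for each quantity. The function name `spectral_diagnostics`, the `hf_fraction` cutoff, and the choice of token features as the graph signal are all hypothetical.

```python
import numpy as np

def spectral_diagnostics(attn, signal, hf_fraction=0.5):
    """Illustrative spectral diagnostics for one attention matrix.

    attn:   (n, n) attention weights for a single head or layer.
    signal: (n,) or (n, d) per-token features treated as a graph signal.
    All definitions below are assumptions, not taken from the paper.
    """
    attn = np.asarray(attn, dtype=float)
    x = np.asarray(signal, dtype=float)
    if x.ndim == 1:
        x = x[:, None]

    # Assumption: symmetrize attention to get an undirected weighted graph.
    w = 0.5 * (attn + attn.T)
    laplacian = np.diag(w.sum(axis=1)) - w

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(laplacian)

    # Fiedler value: second-smallest Laplacian eigenvalue.
    fiedler = float(eigvals[1])

    # Graph Fourier transform: project the signal onto the eigenvectors.
    coeffs = eigvecs.T @ x
    energy = (coeffs ** 2).sum(axis=1)
    total = energy.sum()
    if total == 0.0:
        raise ValueError("signal has zero energy")

    # HFER: share of energy in the highest-frequency modes.
    n = len(eigvals)
    cutoff = n - int(n * hf_fraction)
    hfer = float(energy[cutoff:].sum() / total)

    # Spectral entropy of the normalized energy distribution.
    p = energy / total
    p = p[p > 0]
    entropy = float(-(p * np.log(p)).sum())

    # Smoothness as normalized Dirichlet energy (lower means smoother).
    smoothness = float((eigvals * energy).sum() / total)

    return {
        "fiedler": fiedler,
        "hfer": hfer,
        "spectral_entropy": entropy,
        "smoothness": smoothness,
    }
```

Under these assumed definitions, a signal that is constant across tokens puts all of its energy in the zero-frequency mode, so both HFER and the Dirichlet energy are near zero. How the diagnostics are aggregated across heads and layers, and which of them is thresholded for a validity decision, is not specified in the abstract and is left open here.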

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 50
