Track: tiny paper (up to 4 pages)
Keywords: hyperbolic geometry, probing, large language models, chain-of-thought reasoning, hierarchical representations, Poincaré embeddings, representational compression, non-Euclidean geometry
TL;DR: Hyperbolic probes robustly recover hierarchical reasoning structure from LLM hidden states where Euclidean probes fail due to late-layer representational compression in reasoning-specialized models.
Abstract: Large language models with chain-of-thought reasoning exhibit hierarchical dependencies, yet the geometric structure of these representations remains underexplored. We probe DeepSeek-R1 (reasoning-specialized) and Qwen2.5 (standard instruction-tuned) on PrOntoQA logical reasoning tasks, comparing Euclidean and hyperbolic probe geometries. Hyperbolic probes maintain robust performance across all layers, while Euclidean probes exhibit a late-layer degradation specific to reasoning models: they are stable at early layers but degrade substantially at the final layer. Standard instruction-tuned models show no such degradation. We further show that probing "thinking tokens" (reasoning-critical tokens identified via linguistic markers) concentrates hierarchical information far more effectively than uniform pooling at the compressed final layer. Layer-wise activation statistics link this representational compression to the geometry-dependent performance gap. These findings suggest that hyperbolic geometry provides important robustness advantages for probing reasoning representations, conditional on model architecture.
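For intuition, the sketch below shows the kind of geometry-dependent probe the abstract contrasts: a shared linear projection of a layer's hidden states, followed by distance-to-anchor classification computed either with Euclidean distance or with geodesic distance on the Poincaré ball. This is a minimal illustration under assumed choices (the probe architecture, curvature c = 1.0, dimensions, and anchor parametrization are ours, not necessarily the submission's implementation).

```python
# Minimal sketch: Euclidean vs. hyperbolic (Poincare-ball) distance probe.
# Hypothetical design choices throughout; not the submission's actual code.
import torch
import torch.nn as nn


def expmap0(v: torch.Tensor, c: float = 1.0, eps: float = 1e-6) -> torch.Tensor:
    """Exponential map at the origin of the Poincare ball with curvature -c."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(c**0.5 * norm) * v / (c**0.5 * norm)


def poincare_dist(x: torch.Tensor, y: torch.Tensor,
                  c: float = 1.0, eps: float = 1e-6) -> torch.Tensor:
    """Geodesic distance on the Poincare ball (broadcasts over leading dims)."""
    diff2 = (x - y).pow(2).sum(-1)
    denom = (1 - c * x.pow(2).sum(-1)) * (1 - c * y.pow(2).sum(-1))
    arg = 1 + 2 * c * diff2 / denom.clamp_min(eps)
    return torch.acosh(arg.clamp_min(1 + eps)) / c**0.5


class DistanceProbe(nn.Module):
    """Linear map into probe space; negated distances to class anchors are logits."""

    def __init__(self, d_model: int, d_probe: int, n_classes: int, hyperbolic: bool):
        super().__init__()
        self.proj = nn.Linear(d_model, d_probe)
        # Small init keeps anchors near the origin of the ball after expmap0.
        self.anchors = nn.Parameter(0.01 * torch.randn(n_classes, d_probe))
        self.hyperbolic = hyperbolic

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        z = self.proj(h)                                   # (batch, d_probe)
        if self.hyperbolic:
            z = expmap0(z)                                 # map into the ball (norm < 1)
            a = expmap0(self.anchors)
            d = poincare_dist(z.unsqueeze(1), a.unsqueeze(0))
        else:
            d = (z.unsqueeze(1) - self.anchors.unsqueeze(0)).norm(dim=-1)
        return -d                                          # smaller distance => higher logit


# Usage with dummy hidden states (dimensions are illustrative):
probe = DistanceProbe(d_model=4096, d_probe=64, n_classes=3, hyperbolic=True)
logits = probe(torch.randn(8, 4096))                       # (8, 3)
```

Only the distance computation differs between the two probe variants, so any accuracy gap between them isolates the effect of the probe's geometry rather than its capacity.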
Anonymization: This submission has been anonymized for double-blind review by removing identifying information such as names, affiliations, and URLs.
Presenter: ~Arnav_Raj1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Serve As Reviewer: ~Arnav_Raj1
Submission Number: 113