When Representations Persist but Control Fails: A Mechanistic Analysis of Search in Language Models

TMLR Paper7016 Authors

14 Jan 2026 (modified: 24 Jan 2026) · Under review for TMLR · CC BY 4.0

Abstract: Why do language models fail at multi-step reasoning despite encoding task-relevant structure? We investigate this question through graph traversal, uncovering a striking temporal dissociation: models encode graph-theoretic structure with high fidelity (Spearman ρ = 0.50–0.70) yet fail at autonomous multi-step execution (0% accuracy). Critically, control collapse precedes behavioral error—in 78% of failed trials, internal state drift occurs before the first invalid output—while representations persist beyond failure, remaining structurally intact even as execution breaks down. When execution is externalized to a symbolic planner, performance recovers to 50–100%, confirming preserved evaluative competence. Using SearchEval, a diagnostic lens triangulating behavioral traces, representational geometry, and attention dynamics, we localize the bottleneck to attention-based control mechanisms that progressively decouple from task-relevant state during generation. Attention drifts from task-relevant tokens (65%→40%) even when hidden-state geometry remains intact. Neither layer-time nor generation-time computation exhibits the state-tracking signatures required for systematic search. These findings demonstrate that failure arises from control instability rather than representational inadequacy, suggesting that architectural innovations targeting state persistence—not merely scaling—may be necessary for reliable algorithmic reasoning.

Submission Type: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=nsyGO59QWR&noteId=nsyGO59QWR

Changes Since Last Submission: The previous submission contained a small number of formatting artifacts introduced during compilation, resulting in unresolved reference placeholders (“??”) in the manuscript. These have now been fully corrected. Specifically, all figure, table, and equation references have been verified to resolve correctly, and the manuscript has been recompiled to ensure there are no broken cross-references or missing artifact links. No substantive changes to the technical content, experiments, or conclusions were made.

Assigned Action Editor: ~Ali_Ramezani-Kebrya1

Submission Number: 7016
