Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning
Abstract: Large Language Models (LLMs) display striking surface fluency yet systematically fail at tasks requiring symbolic reasoning, arithmetic accuracy, and logical consistency. This paper offers a structural diagnosis of such failures, revealing a persistent gap between \textit{comprehension} and \textit{competence}. Through controlled experiments and architectural analysis, we demonstrate that LLMs often articulate correct principles without reliably applying them: a failure rooted not in knowledge access but in computational execution. We term this phenomenon the computational \textit{split-brain syndrome}, in which instruction and execution pathways are geometrically and functionally dissociated. This core limitation recurs across domains, from mathematical operations to relational inference, and explains why model behavior remains brittle even under idealized prompting. We argue that LLMs function as powerful pattern-completion engines but lack the architectural scaffolding for principled, compositional reasoning. Our findings delineate the boundary of current LLM capabilities and motivate future models with metacognitive control, principle lifting, and structurally grounded execution. This diagnosis also clarifies why mechanistic interpretability findings may reflect training-specific pattern coordination rather than universal computational principles, and why the geometric separation between instruction and execution pathways suggests limits on neural introspection and mechanistic analysis.
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=KMiTUPmGLi
Changes Since Last Submission: Added a video link.
--
Dear Editor,
Thank you for your acceptance decision and constructive feedback. This submission is our fifth revision and already incorporates extensive reviewer feedback: we clarified the "comprehension without competence" terminology (now explicitly defined as "explanation without execution"), restructured the narrative connecting our three claims, expanded the discussion of FFN computational limitations to distinguish hypotheses from empirical evidence, and clarified ambiguous phrasing throughout.
Regarding visualization methods, we tested UMAP alongside t-SNE as suggested; both methods confirm our geometric separation findings, but t-SNE provided the clearest visual representation. The code implementing these dimensionality-reduction methods is available in our repository (https://github.com/zzhang-cn/comprehension-without-competence) for verification.
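For readers who want the gist of the comparison without cloning the repository, the sketch below illustrates the kind of side-by-side t-SNE/UMAP projection we describe. The array shapes, labels, and parameter choices here are illustrative assumptions, not the settings or data used in the paper; the repository contains the actual implementation.

```python
# Illustrative sketch only: project placeholder activations with t-SNE and UMAP.
# `hidden_states` and `labels` are stand-ins, not the paper's actual data.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from umap import UMAP  # pip install umap-learn

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(200, 768))  # placeholder (n_samples, d_model) activations
labels = np.repeat([0, 1], 100)              # e.g., 0 = instruction-side, 1 = execution-side

# Two-dimensional embeddings from each method.
tsne_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(hidden_states)
umap_2d = UMAP(n_components=2, random_state=0).fit_transform(hidden_states)

# Plot the two projections side by side for visual comparison.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, proj, name in [(axes[0], tsne_2d, "t-SNE"), (axes[1], umap_2d, "UMAP")]:
    ax.scatter(proj[:, 0], proj[:, 1], c=labels, cmap="coolwarm", s=8)
    ax.set_title(name)
plt.show()
```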
As requested, we have added citations to related work, including Lin et al. (2025), positioning it as concurrent work that examines LLM limitations from a safety and alignment perspective, in contrast to our mechanistic focus. We have also incorporated additional relevant citations throughout the manuscript to strengthen the literature review and acknowledge the broader context of research on LLM limitations.
Best regards,
-ZZ
Video: https://youtu.be/Z4waL0GwhyQ?si=sXnrmMb7YYCW95Ry
Code: https://github.com/zzhang-cn/comprehension-without-competence
Assigned Action Editor: ~Gintare_Karolina_Dziugaite1
Submission Number: 5351