Bridging the von Neumann Gap: Why LLMs Haven’t Made Novel Discoveries

Published: 11 Nov 2025 · Last Modified: 23 Dec 2025 · XAI4Science Workshop 2026 · CC BY 4.0
Track: Tiny Paper Track (Page limit: 3-5 pages)
Keywords: World Models, Cognitive Science, Large Language Models, Mechanistic Interpretability, Embodied Learning
TL;DR: LLMs struggle to achieve genuine scientific discovery because, unlike humans, they lack embodied experience and the analogical, schema-based reasoning needed to form deep causal world models.
Abstract: Large language models (LLMs) have been trained on vast data spanning nearly every scientific discipline, yet they rarely produce meaningful novel discoveries. Human polymaths such as John von Neumann routinely generated breakthroughs across disparate fields, from game theory to quantum mechanics to the very architecture of the modern computer, by connecting insights across domains. We argue this gap reflects a structural limitation of the LLM paradigm rather than a problem of scale. Drawing on Piaget’s theory of cognitive development and Gentner’s structure-mapping theory, we contend that novel discovery depends on two core processes: constructing nuanced internal schemas of the external world and flexibly redeploying them via analogical mapping. Without embodied data or exploration, LLMs form shallow world models, and because their architectures optimize for statistical efficiency, they struggle to extend analogies out of distribution in ways that capture relational structure across domains. Without rethinking training environments and architectures, LLMs will remain constrained to weak abstraction rather than the deep reasoning required for scientific innovation.
Submission Number: 17