Towards Computational Comprehension: A Non-Anthropocentric Framework for Evaluating LLM Understanding

ACL ARR 2025 May Submission 8083 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: This position paper introduces and motivates Computational Comprehension, an alternative, non-anthropocentric approach to assessing how Large Language Models (LLMs) handle knowledge. Standard benchmarks often reward models for surface-level accuracy while shedding little light on deeper conceptual understanding; Computational Comprehension instead directs attention to the model’s internal processes. Specifically, we ask whether particular neurons or sub-networks remain consistently activated across reformulations and contextual shifts of the same underlying concept. We outline a framework that first tests a model’s ability to preserve conceptual invariance under various input transformations, then observes how targeted ablations of the relevant sub-networks affect performance. By gauging these internal, concept-related responses rather than relying solely on external metrics, we obtain finer-grained insight into a model’s capacity to internalize, manipulate, and robustly apply conceptual knowledge. We also propose integrating such analysis into systematic experiments, showing how subtle changes to task prompts or data can reveal whether a model is genuinely concept-driven or merely parroting surface correlations. Through Computational Comprehension, we encourage researchers, engineers, and theorists to adopt a deeper, more transparent mode of evaluation, one that foregrounds internal conceptual grounding over score-centric races toward ever-higher benchmark numbers.
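To make the two probes in the abstract concrete (activation consistency across reformulations, then targeted ablation), here is a minimal illustrative Python sketch. It is not the authors' implementation: it uses a toy PyTorch MLP as a stand-in for an LLM layer, treats "active" as a positive post-ReLU output, and mocks paraphrases as noisy copies of one base input. All names and thresholds here are hypothetical assumptions for illustration only.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "model": one hidden layer whose units we treat as
# candidate concept neurons (a real study would hook an LLM layer).
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))

captured = {}

def capture_hidden(module, inputs, output):
    # Record post-ReLU activations for later consistency analysis.
    captured["hidden"] = output.detach()

model[1].register_forward_hook(capture_hidden)

# Three "paraphrases" of the same underlying concept, mocked here
# as noisy variants of one base input vector.
base = torch.randn(1, 16)
paraphrases = [base + 0.05 * torch.randn(1, 16) for _ in range(3)]

# Probe 1: which hidden units stay active across every reformulation?
active_masks = []
for x in paraphrases:
    model(x)
    active_masks.append(captured["hidden"] > 0)
consistent = torch.stack(active_masks).all(dim=0).squeeze(0)
print(f"consistently active units: {int(consistent.sum())} / {consistent.numel()}")

# Probe 2: targeted ablation. Zero out the consistently active units
# and compare the model's output before vs. after.
def ablate_consistent(module, inputs, output):
    # Returning a tensor from a forward hook replaces the layer output.
    return output * (~consistent).float()

before = model(base)
handle = model[1].register_forward_hook(ablate_consistent)
after = model(base)
handle.remove()
print("output shift under ablation:", (before - after).norm().item())

In this sketch, a large output shift under ablation of the consistently active units would be the kind of internal, concept-linked evidence the framework looks for, as opposed to a benchmark score alone.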
Paper Type: Long
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: cognitive modeling, computational psycholinguistics, reflections and critiques, benchmarking
Contribution Types: Position papers, Theory
Languages Studied: English
Submission Number: 8083