Keywords: Abstract Reasoning, AI, NeuroAI, Large Language Models (LLMs), Electroencephalography (EEG), Fixation Related Potentials (FRPs), Representational Similarity Analysis (RSA)
TL;DR: Large language models (LLMs), particularly those with approximately 70 billion parameters, show human-like performance on an abstract-pattern-completion task, and their internal representations align with human brain activity (EEG) recorded during the same task.
Abstract: This study investigates whether large language models (LLMs) mirror human neurocognition during abstract reasoning. We compared the performance and neural representations of human participants with those of eight open-source LLMs on an abstract-pattern-completion task. We leveraged pattern-type differences in task performance and in fixation-related potentials (FRPs) recorded by electroencephalography (EEG) during the task. Our findings indicate that only the largest tested LLMs (~70 billion parameters) achieve human-comparable accuracy, with Qwen-2.5-72B and DeepSeek-R1-70B also showing similarities with the human pattern-specific difficulty profile. Critically, every LLM tested forms representations that distinctly cluster the abstract pattern categories within its intermediate layers, although the strength of this clustering scales with the model's performance on the task. Moderate positive correlations were observed between the representational geometries of task-optimal LLM layers and human frontal FRPs. These results consistently diverged from the corresponding comparisons with other EEG measures (response-locked ERPs and resting EEG), suggesting a potential shared representational space for abstract patterns. This indicates that LLMs might mirror human brain mechanisms in abstract reasoning, offering preliminary evidence of shared principles between biological and artificial intelligence.
Paper Type: New Full Paper
Supplementary Material: pdf
Submission Number: 51
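
The abstract compares the "representational geometries" of LLM layers and frontal FRPs using representational similarity analysis (RSA). The sketch below illustrates the general RSA idea only and is not the authors' pipeline: the distance metric (1 minus Pearson correlation), the rank-based comparison of the two dissimilarity matrices, the number of conditions, the feature dimensions, and the synthetic data are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def rdm(patterns: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix for a (conditions x features) array.

    Dissimilarity is 1 - Pearson r between condition rows (an assumed metric).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix: np.ndarray) -> np.ndarray:
    """Off-diagonal upper-triangle entries, i.e. each condition pair once."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ordinal_ranks(values: np.ndarray) -> np.ndarray:
    """Ranks from 0 to n-1; ties are broken by position, which is adequate
    for continuous data where exact ties are not expected."""
    ranks = np.empty(len(values), dtype=float)
    ranks[np.argsort(values)] = np.arange(len(values))
    return ranks


def rsa_score(a: np.ndarray, b: np.ndarray) -> float:
    """Rank correlation between the RDMs of two systems.

    Both inputs need the same number of rows (conditions); the number of
    features (columns) may differ, which is what lets RSA compare LLM
    activations with EEG signals.
    """
    va = upper_triangle(rdm(a))
    vb = upper_triangle(rdm(b))
    return float(np.corrcoef(ordinal_ranks(va), ordinal_ranks(vb))[0, 1])


# Synthetic stand-ins: both "systems" are noisy projections of one latent code.
rng = np.random.default_rng(0)
n_conditions = 6  # hypothetical number of pattern categories
latent = rng.normal(size=(n_conditions, 4))
llm_layer = latent @ rng.normal(size=(4, 64)) + 0.1 * rng.normal(size=(n_conditions, 64))
frp = latent @ rng.normal(size=(4, 32)) + 0.1 * rng.normal(size=(n_conditions, 32))

score = rsa_score(llm_layer, frp)
print(f"RSA score between synthetic LLM layer and synthetic FRP: {score:.3f}")
```

Because only the condition-by-condition dissimilarity structure is compared, the two systems never need a shared feature space, which is the property that makes this kind of model-to-brain comparison possible.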