Abstract: Fundamental Language Models (FLMs) represent a novel paradigm that separates linguistic competence from factual knowledge to address critical challenges in current language models, including hallucinations, data privacy concerns, and training-induced biases. This paper investigates whether FLMs can maintain robust language processing capabilities while externalizing factual knowledge. Through a comprehensive evaluation of linguistic competence across model sizes using specialized benchmarks, we assess lexical, grammatical, and semantic capabilities. We also analyze how model size affects the encoding of both linguistic and factual knowledge. Our findings demonstrate that linguistic competence stabilizes at relatively modest model sizes, while factual knowledge continues to scale with model size. These results provide empirical support for FLMs as a promising research direction, suggesting that future work could effectively balance language understanding with external knowledge retrieval.
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: retrieval-augmented models, data influence, linguistic theories, reasoning, benchmarking
Languages Studied: English
Submission Number: 2560