Illusory Generalization in NLP: Why Scaling Laws Mask Systematic Failures in Out-of-Distribution Reasoning
Abstract: Despite rapid advances in natural language processing (NLP) driven by large-scale neural models, the claim that these models achieve true generalization remains questionable. This paper argues that the performance gains predicted by scaling laws, under which larger models appear to perform better, do not equate to genuine conceptual generalization. Instead, models optimize for memorization of latent statistical patterns, leading to overestimates of their generalization capabilities. We critique existing evaluation methodologies, highlight systematic failures in out-of-distribution (OOD) reasoning, and propose a cognitive-science-inspired approach to benchmarking generalization.
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: generalization, benchmarking, cognitive science
Contribution Types: Position papers
Languages Studied: English
Submission Number: 5399