Keywords: surprisal, word frequency, creativity, large language models, interpretability, scaling
Abstract: Language model (LM) surprisal is widely used as a proxy for contextual predictability and has recently been reported to correlate with metaphor novelty. However, surprisal is tightly intertwined with lexical frequency. We study this interaction using novelty scores of metaphoric words in context. We analyse measures from 8 Pythia model sizes and 154 intermediate checkpoints. Across settings, word frequency is more strongly associated with novelty than surprisal is. Across training stages, the surprisal--novelty association peaks at an early stage and then declines, mirroring a similarly timed increase in the surprisal--frequency association. These results suggest that the often-reported optimal LM surprisal settings may incorrectly associate contextual predictability with novelty and processing difficulty.
Paper Type: Short
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: computational psycholinguistics, linguistic theories
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 10096