A cute horgous meets a scary timfil: how do we interpret novel words in context?

Published: 03 Oct 2025, Last Modified: 13 Nov 2025, CPL 2025 Talk, CC BY 4.0
Keywords: Pseudowords; Form-meaning systematicity; Free associations; Self-paced Reading; Valence
TL;DR: Do systematic form-meaning mappings observed for isolated pseudowords influence semantic representations for words in context?
Abstract: Background. Studies on form-meaning mappings in the lexicon have highlighted many systematic relations between the surface form of words and what they mean [1, 2, 3, 4]. Recent studies further showed that prime-target similarity and semantic neighbourhood density also influence lexical decision latencies for pseudowords [5, 6], blurring the boundary between words and pseudowords. From a language learning perspective, such a blurred boundary makes sense: every word a person knows was once a pseudoword to them during development, and many valid words are pseudowords to many speakers, who may encounter them and have to quickly form a semantic representation. Our work examines the interplay between the semantic connotations conveyed by the word form itself [7] and those conveyed by the sentence context in which the word form is first introduced [8, 9]. We focus on valence, i.e., how positive or negative a stimulus is perceived to be, building on valence ratings collected for isolated English pseudowords [10]. This abstract details the methodology and analysis plan we will follow.

Materials. In our behavioural experiment, each trial will consist of a free association (FA) task and a self-paced reading (SPR) task, with task order counterbalanced across subjects (see Figure 1). The key manipulation involves the valence of the target (pseudo)word, as gauged from word [11] and pseudoword [10] ratings, and the valence of the sentence in which it will appear. Our targets consist of 40 pseudowords from [10] and 40 words from [12] (13 negative, 14 neutral, and 13 positive in each set, determined from ratings when available and from definitions when not). We then created 40 sentences (13 negative, 14 neutral, and 13 positive, determined from the valence of the content words in the sentence [11]). For each participant, we sample 20 targets (10 words and 10 pseudowords, with three positive, four neutral, and three negative in each group), embedded in 20 sentences (seven positive, six neutral, and seven negative).
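The per-participant sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the pool sizes and quotas come from the abstract, while the item identifiers, the helper names (`make_pool`, `sample_by_valence`, `sample_participant`), and the use of seeded simple random sampling without replacement are hypothetical. How targets are paired with sentences is not shown.

```python
import random

# Pool sizes and per-participant quotas as stated in the abstract.
POOL_SIZES = {"negative": 13, "neutral": 14, "positive": 13}
TARGET_QUOTA = {"positive": 3, "neutral": 4, "negative": 3}  # per target type
SENTENCE_QUOTA = {"positive": 7, "neutral": 6, "negative": 7}


def make_pool(prefix, sizes):
    """Build placeholder item IDs (hypothetical stand-ins for the real stimuli)."""
    return {v: [f"{prefix}_{v}_{i}" for i in range(n)] for v, n in sizes.items()}


def sample_by_valence(pool, quota, rng):
    """Draw quota[valence] items without replacement from pool[valence]."""
    return {v: rng.sample(pool[v], n) for v, n in quota.items()}


def sample_participant(seed):
    """Sample one participant's words, pseudowords, and sentences (assumed scheme)."""
    rng = random.Random(seed)  # seeded so a participant's list is reproducible
    return {
        "words": sample_by_valence(make_pool("word", POOL_SIZES), TARGET_QUOTA, rng),
        "pseudowords": sample_by_valence(
            make_pool("pseudoword", POOL_SIZES), TARGET_QUOTA, rng
        ),
        "sentences": sample_by_valence(
            make_pool("sentence", POOL_SIZES), SENTENCE_QUOTA, rng
        ),
    }
```

Calling `sample_participant(seed=1)` returns three dictionaries keyed by valence, holding 10 words, 10 pseudowords, and 20 sentences in total.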
(Pseudo)word and sentence valence will be fully crossed.

Hypotheses. We hypothesise that negatively-rated pseudowords (e.g., horgous) will elicit more negatively valenced associates than positively-rated pseudowords (e.g., timfil), even when embedded in a sentence with positive valence (and vice versa for positively-rated pseudowords). Moreover, reading times for the words immediately following the target pseudoword are expected to be longer when the pseudoword’s valence conflicts with the sentence’s valence, which would indicate a processing cost.

Analysis plan. We will use thesauri as well as distributed semantic representations to estimate i) the valence and ii) the semantic coherence of the associates produced for a (pseudo)word. We will then compare these measures between participants who produced associates before seeing the target in context and participants who first saw the target in context, to study the influence that the sentence exerts on the target (pseudo)word’s associations. Moreover, we will analyse reading times (RTs) on the word immediately following the target. Associates’ coherence and valence, as well as RTs, will be analysed using mixed models, with (pseudo)word valence and sentence valence as the key predictors. We will further control for word length, plausibility, orthographic neighbourhood density, and orthographic overlap with the likeliest word to occur in the sentence context. Finally, we will consider the surprisal of the contextualised target (pseudo)words under different large language models to investigate whether these models capture form-meaning mappings in context.
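One way the two associate-level measures in the analysis plan could be computed is sketched below. The abstract does not specify the operationalisation, so two assumptions are made here: valence is the mean rating of the associates found in a norms table, and semantic coherence is the mean pairwise cosine similarity of the associates' vectors. The function names, the toy ratings, and the toy two-dimensional vectors are all hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def mean_valence(associates, valence_norms):
    """Average valence of the associates that have a rating; None if none do."""
    rated = [valence_norms[a] for a in associates if a in valence_norms]
    return mean(rated) if rated else None


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence(associates, vectors):
    """Mean pairwise cosine similarity among associates with a known vector."""
    known = [vectors[a] for a in associates if a in vectors]
    pairs = list(combinations(known, 2))
    return mean(cosine(u, v) for u, v in pairs) if pairs else None


# Toy data for illustration only (not real norms or embeddings).
toy_norms = {"ugly": 2.0, "monster": 3.0, "sweet": 8.0}
toy_vectors = {"ugly": [1.0, 0.0], "monster": [1.0, 0.0], "sweet": [0.0, 1.0]}

associates = ["ugly", "monster", "sweet"]
valence_score = mean_valence(associates, toy_norms)
coherence_score = coherence(associates, toy_vectors)
```

Associates missing from the norms table or the vector space are skipped rather than imputed, which is one design choice among several; the per-trial scores could then serve as dependent variables in the mixed models.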
Submission Number: 13