What does it take to predict RTs in Maltese subliminal language processing? Traditional experimental predictors vs. predictors based on distributional semantics
Keywords: subliminal processing, Maltese, post-hoc analysis, predictors, distributional semantics
TL;DR: This paper presents a post-hoc analysis of subliminal processing data, comparing traditional experimental predictors with predictors taken from distributional semantics (cosine similarity based on BERT embeddings).
Abstract: Masked priming paradigms have long served as a window into unconscious language processing, revealing how lexical, morphological, and semantic information is accessed during early stages of word recognition [1, 2]. A key debate centers on whether morphological form (shared consonantal roots in Semitic languages) or semantic content is activated first [3]. While some studies argue for form-based decomposition [4], others point to semantic transparency as a prerequisite for priming [5]. This tradeoff appears to be modality-sensitive, with auditory processing more attuned to meaning, and visual processing more driven by orthographic form [6, 7]. In these paradigms, shorter reaction times (RTs) typically indicate facilitation, while longer RTs suggest interference.
Building on this, we examine two masked priming studies in Maltese, a morphologically rich, etymologically hybrid language, to test how well traditional predictors (experimental Condition) versus continuous predictors from distributional semantics (cosine similarity from BERT embeddings) account for reaction times in auditory and visual subliminal priming. Condition captured three levels of prime–target relatedness: Identity (same word), Related (three shared root consonants, but not necessarily shared meaning or etymology), and Unrelated (partial phonological similarity only). Each pair was further classified by Source, reflecting etymological origin: Semitic–Semitic, Non-Semitic–Non-Semitic, or hybrid (Semitic–Non-Semitic / Non-Semitic–Semitic).
We analyzed ~6,000 auditory and ~5,000 visual RTs using linear mixed-effects regression. Word embeddings were derived from mBERTu [8], and cosine similarity was computed for each prime–target pair.
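As an illustration of the cosine-similarity predictor, the following is a minimal sketch of the computation for one prime–target pair. It assumes each word has already been reduced to a single fixed-length embedding vector; the toy four-dimensional vectors and the names `cosine_similarity`, `prime_vec`, and `target_vec` are hypothetical and are not taken from the study, and the step of extracting embeddings from mBERTu is not shown.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real BERT-style vectors have hundreds of dimensions.
prime_vec = [0.2, 0.7, -0.1, 0.4]
target_vec = [0.1, 0.8, 0.0, 0.3]

similarity = cosine_similarity(prime_vec, target_vec)
print(f"cosine similarity: {similarity:.3f}")
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate near-orthogonal vectors; the resulting number per pair is what would enter the regression as a continuous predictor.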
In the auditory modality, categorical Condition predicted RTs best overall (Condition model vs. cosine-similarity model: χ²(1) = 12.61, p < .001), showing robust identity priming effects and significant delays for both related and unrelated pairs. However, cosine similarity still captured meaningful variance, especially in unrelated and hybrid-origin (Semitic–Non-Semitic) word pairs, indicating that distributional semantic models can detect subtle semantic facilitation even when morphological cues are unreliable (Figure 1).
In the visual modality, both Condition and cosine similarity significantly predicted RTs, but neither model outperformed the other (χ²(1) ≈ 0, p ≈ 1). Importantly, cosine similarity did not explain variation within the related or unrelated subsets, suggesting that visual masked priming is less sensitive to gradient semantic relationships and more reliant on surface form cues.
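The reported model-comparison statistics can be sanity-checked against their p-values with a short sketch. It assumes the statistics are chi-square values with one degree of freedom, as written; for that case the upper-tail probability has the closed form erfc(√(x/2)). The function name `chi2_sf_df1` is hypothetical, and the sketch does not reproduce the mixed-effects models themselves.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


p_auditory = chi2_sf_df1(12.61)  # auditory comparison, statistic as reported
p_visual = chi2_sf_df1(0.0)      # visual comparison, statistic reported as approximately 0

print(f"auditory: p = {p_auditory:.5f}")
print(f"visual:   p = {p_visual:.5f}")
```

Under this assumption the auditory statistic falls below the .001 threshold and a statistic of 0 gives p = 1, in line with the values stated above.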
Together, these findings demonstrate that distributional semantics can complement, but not fully replace, traditional experimental predictors in psycholinguistic modeling. Their contribution is particularly valuable when explicit morphological structure is ambiguous or deceptive, as in hybrid-origin word pairs. Moreover, the modality contrast supports the view that auditory comprehension engages deeper semantic routes than early visual word recognition. This study underscores the potential of integrating large-language-model embeddings with experimental psycholinguistics to model fine-grained human behavior in language tasks, especially in underexplored, morphologically complex languages like Maltese.
Submission Number: 18