Evaluating Syntactic Generalization in Multilingual Language Models through Targeted Test Suites

Published: 03 Oct 2025, Last Modified: 13 Nov 2025, CPL 2025 Poster, CC BY 4.0
Keywords: syntactic evaluation, French, surprisal, language models, morphosyntax, test suites
TL;DR: We evaluate whether language models capture French morphosyntactic rules by measuring surprisal in controlled agreement test suites.
Abstract: This paper outlines a methodology for evaluating the morphosyntactic generalization capabilities of neural language models (LMs), focusing on French as a test case. Building on targeted syntactic evaluation frameworks originally developed for English, we explain how surprisal-based metrics, derived from the probabilities a model assigns to tokens, can be used to probe its internal representations of morphology and syntax. Through carefully designed minimal sentence pairs, we demonstrate how varying morphosyntactic features (e.g., agreement, argument structure, tense, and mood) yields measurable predictions about model behavior. We argue that such evaluations allow us to assess the extent to which language models represent morphosyntactic constraints.
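The surprisal-based comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-token probabilities below are invented placeholder values, and the French minimal pair is a hypothetical example of the subject-verb agreement contrasts the test suites target. In practice the probabilities would come from an actual LM's next-token distribution.

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: -log2 of the probability the LM assigns to a token."""
    return -math.log2(prob)

# Hypothetical probabilities an LM might assign to the agreement-marked verb
# in a French minimal pair (illustrative values, not from any real model):
#   "Les chiens dorment."  -> grammatical (plural verb)
#   "Les chiens dort."     -> ungrammatical (singular verb)
p_grammatical = 0.12
p_ungrammatical = 0.003

s_gram = surprisal(p_grammatical)      # lower surprisal = more expected
s_ungram = surprisal(p_ungrammatical)  # higher surprisal = less expected

# The model "passes" the item if the grammatical continuation
# is less surprising than the ungrammatical one.
passes = s_gram < s_ungram
print(passes)
```

Aggregating this pass/fail judgment over many controlled minimal pairs yields an accuracy score per syntactic phenomenon, which is the standard way targeted test suites summarize model behavior.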
Submission Number: 54