Examining Large Language Models' form-meaning mappings of information structure constructions in Mandarin Chinese

Published: 18 May 2026, Last Modified: 19 May 2026
Venue: CoNLL 2026 Archival
License: CC BY 4.0
Keywords: information structure constructions, construction grammar, minimal pairs, Mandarin Chinese
Abstract: Construction Grammar (CxG) knowledge in language models has been extensively studied for English, but remains underexplored in other languages. In Mandarin Chinese, the ba (把, disposal) and bei (被, passive) constructions are widely used for managing information structure. They foreground topical elements (information structure) and encode systematic form-meaning mappings (CxG), particularly with respect to the semantic role of the object. We probe language models' linguistic competence with these constructions using a new minimal-pair dataset comprising seven paradigms that target both syntactic constraints and verb-construction compatibility. Our results show that many models still struggle to capture the form-meaning mappings underlying the ba construction, although they achieve high accuracy on paradigms driven by surface syntactic cues.
Scope Confirmation: To the best of my judgment, this submission falls within the scope of CoNLL.
Primary Area Selection: Computational Usage-Based Grammars (e.g., Construction Grammars)
Secondary Area Selection: Computational Psycholinguistics, Cognition and Linguistics
Use Of Generative Artificial Intelligence Tools: No, not at all
Data Collection From Human Subjects: No
Submission Type: Archival: I certify that the submission has not been previously published, nor is the material in it under review by another journal or conference. Further, no material in it will be submitted for review at another conference or journal while under review by CoNLL 2026.
Submission Number: 175