Generalization through Lexical Abstraction in Transformer Models: the Case of Functional Words

ACL ARR 2025 February Submission 2347 Authors

14 Feb 2025 (modified: 09 May 2025) · CC BY 4.0
Abstract: "The researchers wrote the paper" and "They wrote it" share syntactic and semantic information that is easily recognizable to humans. Specifically, the latter is an abstraction of the former. Can language models also recognize the syntactic and semantic parallelism of the two sentences, which relies on lexical abstraction? We present a study that aims to uncover whether a language model encodes words and sentences in a way that reflects this linguistic abstraction. We compare representations of nouns, on the one hand, and of the pronouns and adverbs (functional words) that can replace these nouns, as well as the corresponding lexicalized and functional sentences, on the other. Shallow analyses show that nouns and functional words inhabit different areas of the embedding space, both when considered in isolation and when placed in the same sentential contexts. Deeper analyses, however, show that the structure shared between lexicalized sentences and their functional variants is encoded and can be recovered from their embeddings. Our results thus indicate that, when properly constrained by structure, the information supporting generalization through the abstraction provided by pronouns and functional words can be revealed.
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: linguistic abstraction, functional words, shared structure
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2347