Where does meaning live? Investigating the synthetic-analytic distinction in LLMs using gender as a case study

ACL ARR 2024 June Submission2616 Authors

15 Jun 2024 (modified: 08 Aug 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Some linguistic inferences (e.g., inferring that a square has four sides) seem to follow inherently from what words mean, while others (e.g., inferring that a house has four sides) are considered to follow from "common sense" or "world knowledge". It has long been debated whether such categorical distinctions, referred to in philosophy as analytic vs. synthetic, can be made and what effect they should have on theories and models of semantic meaning. In this paper, we use gender (male vs. female) as a case study to explore whether large language models (LLMs) differentiate analytic inferences about gender (e.g., that a woman is female) from synthetic inferences (e.g., that nurses are most often female). We find that, by and large, there are no substantial mechanistic differences; rather, the difference appears to be a matter of degree, i.e., how strongly the inference is encoded and how easily it is overwritten by contextual information. Our study serves as a proof of concept for how LLMs can be used to revisit long-standing questions about language representation and processing in general.
Paper Type: Short
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: word embeddings, semantic textual similarity
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 2616