Exploring the Small World of Word Embeddings: A Comparative Study on Conceptual Spaces from LLMs of Different Scales
Abstract: A conceptual space represents concepts as nodes and semantic relatedness as edges. Word embeddings, paired with a similarity metric, offer an efficient way to construct such a space. Typically, these embeddings come from traditional distributional models or encoder-only pretrained models, as their objectives directly capture the current token’s meaning. In contrast, decoder-only models, including large language models (LLMs), predict the next token, making their embeddings less directly tied to the current token’s semantics. This paper constructs a conceptual space using word embeddings from LLMs and explores its properties. We build a network based on a linguistic typology-inspired connectivity hypothesis, analyze global statistics, and compare LLMs of different scales. Locally, we examine conceptual pairs, WordNet relations, and a cross-lingual semantic network for qualitative words. Our results show that the space exhibits small-world properties, with a high clustering coefficient and short path lengths. Larger LLMs produce more complex spaces, characterized by longer paths and richer relational structures. Additionally, the network serves as an efficient basis for constructing cross-lingual semantic maps.
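To make the pipeline described in the abstract concrete, the following is a minimal sketch (not the authors' code) of how a conceptual network can be built from word embeddings with a similarity metric and then checked for small-world statistics. The similarity threshold `tau` and the dictionary `embeddings` are illustrative assumptions standing in for the paper's connectivity hypothesis and its LLM-derived embeddings.

```python
# Minimal sketch: build a concept graph from embeddings and measure
# small-world statistics (clustering coefficient, average path length).
# `embeddings` and `tau` are hypothetical placeholders, not the paper's setup.
import numpy as np
import networkx as nx


def build_conceptual_graph(embeddings: dict, tau: float = 0.6) -> nx.Graph:
    """Connect two concepts when their cosine similarity is at least tau."""
    words = list(embeddings)
    vecs = np.stack([embeddings[w] for w in words])
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # unit-normalize rows
    sims = vecs @ vecs.T                                       # pairwise cosine similarities
    graph = nx.Graph()
    graph.add_nodes_from(words)
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            if sims[i, j] >= tau:
                graph.add_edge(words[i], words[j], weight=float(sims[i, j]))
    return graph


def small_world_stats(graph: nx.Graph) -> tuple:
    """Return (average clustering coefficient, average shortest path length),
    computed on the largest connected component to keep paths well-defined."""
    giant_nodes = max(nx.connected_components(graph), key=len)
    giant = graph.subgraph(giant_nodes)
    return nx.average_clustering(giant), nx.average_shortest_path_length(giant)
```

A high clustering coefficient together with a short average path length, relative to a comparable random graph, is the usual operational signature of the small-world property reported in the abstract.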
Paper Type: Long
Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics
Research Area Keywords: conceptual spaces, large language models, word embeddings, small world
Contribution Types: Model analysis & interpretability
Languages Studied: mainly English; multiple other languages are also included
Submission Number: 5264