Keywords: Large Language Models (LLMs), Neural Embeddings, Word Embeddings, Neural Collapse, Interpretability, Optimization
TL;DR: This work reveals how next-token prediction encodes latent linguistic concepts, linking language data geometry with semantics through SVD, advancing our understanding of semantic learning in LLMs.
Abstract: Modern language models, trained through the conceptually simple next-token prediction (NTP) objective, demonstrate a remarkable ability to capture meaning despite being trained only on explicit (context, next-word) pairs. This raises a fundamental question: How do these models extract and encode latent concepts—such as semantic dichotomies like true/false and male/female, or grammatical distinctions like nouns/verbs—during training? We discover that these latent concepts are inherently encoded in the singular value decomposition of a data sparsity matrix, which captures the support structure of conditional next-word probabilities. While NTP training never explicitly constructs this matrix, the emergent word and context embeddings naturally factor it, thereby capturing linguistic structure.
Our results reveal a new form of neural-collapse geometry of latent concepts in NTP, going beyond the embedding geometries previously studied in balanced one-hot classification settings.
Furthermore, while our findings share conceptual similarities with classical distributional semantics, they reveal how neural models can acquire semantic concepts during training without explicitly constructing co-occurrence matrices.
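To make the construction concrete, the following is a minimal sketch (not the authors' code) of the data sparsity matrix described in the abstract: a binary matrix recording which next words have nonzero conditional probability after each context, whose SVD separates words into latent concept groups. The toy vocabulary and contexts are hypothetical and chosen only for illustration.

```python
# Minimal sketch (assumed toy setup, not the paper's data or code): build a
# binary "data sparsity matrix" S whose (i, j) entry is 1 when word j can
# follow context i in the training data, then inspect its SVD.
import numpy as np

vocab = ["true", "false", "he", "she", "run", "walk"]
data = [
    ("the statement is", {"true", "false"}),
    ("the claim was",    {"true", "false"}),
    ("i believe it is",  {"true", "false"}),
    ("yesterday",        {"he", "she"}),
    ("back then",        {"he", "she"}),
    ("we like to",       {"run", "walk"}),
]

# Support matrix of the conditional next-word distributions:
# rows = contexts, columns = vocabulary words.
S = np.zeros((len(data), len(vocab)))
for i, (_, next_words) in enumerate(data):
    for w in next_words:
        S[i, vocab.index(w)] = 1.0

# SVD of the sparsity matrix. Words supported by the same contexts load on the
# same right singular vector, so the top directions separate the latent
# concept groups (true/false vs. he/she vs. run/walk).
U, sing_vals, Vt = np.linalg.svd(S, full_matrices=False)
print("singular values:", np.round(sing_vals, 3))
for k in range(3):
    print(f"word direction {k}:", dict(zip(vocab, np.round(Vt[k], 2))))
```

In this sketch the singular directions act as the latent concept axes; the paper's claim is that NTP-trained word and context embeddings implicitly factor such a matrix without ever forming it.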
Submission Number: 63