Geometry of Concepts in Next-token Prediction: Neural-Collapse Meets Semantics

Published: 11 Feb 2025, Last Modified: 06 Mar 2025, CPAL 2025 (Recent Spotlight Track), CC BY 4.0
Keywords: Large Language Models (LLMs), Neural Embeddings, Word Embeddings, Neural-Collapse, Interpretability, Optimization
TL;DR: This work reveals how next-token prediction encodes latent linguistic concepts, linking the geometry of language data to semantics through the SVD and advancing our understanding of semantic learning in LLMs.
Abstract: Modern language models, trained through the conceptually simple next-token prediction (NTP) objective, demonstrate a remarkable ability to capture meaning despite being trained only on explicit (context, next-word) pairs. This raises a fundamental question: How do these models extract and encode latent concepts, such as semantic dichotomies like true/false and male/female, or grammatical distinctions like noun/verb, during training? We discover that these latent concepts are inherently encoded in the singular value decomposition of a data sparsity matrix, which captures the support structure of conditional next-word probabilities. While NTP training never explicitly constructs this matrix, the emergent word and context embeddings naturally factor it, thereby capturing linguistic structure. Our results reveal a new form of neural-collapse geometry of latent concepts in NTP that goes beyond the traditional embedding geometry previously studied in balanced one-hot classification settings. Furthermore, while sharing conceptual similarities with classical distributional semantics, our findings show how neural models can acquire semantic concepts during training without explicitly constructing co-occurrence matrices.
Submission Number: 63
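
The abstract's central object is the SVD of a "data sparsity matrix" recording which next words have nonzero conditional probability given each context. The following is a minimal, hypothetical sketch (not taken from the paper): it builds a toy support matrix over an invented vocabulary and context set, computes its SVD with NumPy, and prints the word loadings of each singular direction, where words sharing support patterns (e.g., the true/false pair) group together.

```python
# Minimal sketch with a hypothetical toy vocabulary and context set.
# Rows of S are contexts, columns are candidate next words; an entry is 1
# when the word can follow the context (nonzero conditional probability).
import numpy as np

vocab = ["he", "she", "king", "queen", "true", "false"]
contexts = ["the man is", "the woman is", "that claim is", "the monarch is"]

# Hypothetical support pattern: which next words are admissible per context.
S = np.array([
    [1, 0, 1, 0, 0, 0],   # "the man is"     -> he, king
    [0, 1, 0, 1, 0, 0],   # "the woman is"   -> she, queen
    [0, 0, 0, 0, 1, 1],   # "that claim is"  -> true, false
    [0, 0, 1, 1, 0, 0],   # "the monarch is" -> king, queen
], dtype=float)

# SVD of the sparsity matrix; the paper's claim is that NTP embeddings
# implicitly factor a matrix of this kind, without ever constructing it.
U, sigma, Vt = np.linalg.svd(S, full_matrices=False)

# Right singular vectors give one loading per vocabulary word; words with
# similar support patterns load on the same directions.
for k in range(len(sigma)):
    print(f"sigma_{k} = {sigma[k]:.3f}, word loadings:",
          dict(zip(vocab, np.round(Vt[k], 2))))
```

In this toy setup the true/false pair occupies a singular direction disjoint from the gendered words, illustrating (under the assumptions above) how dichotomies can surface as separate directions of the decomposition.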