NUMBER REPRESENTATIONS IN LLMS: A COMPUTATIONAL PARALLEL TO HUMAN PERCEPTION

16 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Natural Logarithmic, Number line, LLM, representations, embeddings
Abstract: We provide empirical evidence that large language models (LLMs) encode numerical values on a compressed, logarithmic number line, challenging the prevailing assumption of linear representation. Extracting hidden states for numerals, we project them onto one-dimensional manifolds using dimensionality reduction and evaluate two complementary metrics: Spearman’s $\rho$ to measure monotonicity and a newly introduced Scaling Rate Index ($\beta$) that quantifies whether spacing is sublinear, linear, or superlinear. Across several LLM families, unsupervised projection into low-dimensional subspaces of maximum variance consistently uncovers strong sublinear trends. Interventions along the discovered number-line directions causally modulate next-number predictions, demonstrating that these dimensions encode genuine numerical structure. This compressed geometry is robustly observed in controlled log-spaced prompts and in real-world settings such as birth years, but is absent in non-numerical controls. Our findings refine the linear representation hypothesis by showing that numerical magnitudes occupy a structured subspace whose internal geometry is systematically non-uniform and logarithmic in nature.
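The analysis pipeline sketched in the abstract can be illustrated as follows. This is a minimal, self-contained sketch rather than the authors' code: the numeral hidden states are replaced by a synthetic stand-in matrix, the one-dimensional projection uses PCA as a generic maximum-variance method, and the Scaling Rate Index β is approximated here as the exponent of a power-law fit of projected position against numeral value (β < 1 indicating sublinear spacing), which is one plausible reading of the metric, not its definition from the paper.

```python
# Minimal sketch (not the authors' implementation): project numeral hidden
# states onto a 1-D axis of maximum variance, then measure monotonicity
# (Spearman's rho) and a power-law exponent beta as a proxy for the
# Scaling Rate Index (beta < 1 => sublinear spacing).
import numpy as np
from scipy.stats import spearmanr
from sklearn.decomposition import PCA

values = np.arange(1, 1001)  # numerals 1..1000
rng = np.random.default_rng(0)

# Stand-in hidden states: a log-compressed direction plus noise, serving as a
# placeholder for real LLM activations extracted at each numeral's token position.
direction = rng.normal(size=768)
H = np.log(values)[:, None] * direction[None, :] \
    + 0.1 * rng.normal(size=(len(values), 768))

# Unsupervised projection onto the top principal component (maximum variance).
pos = PCA(n_components=1).fit_transform(H).ravel()
if spearmanr(pos, values)[0] < 0:     # orient the axis so positions increase
    pos = -pos

rho = spearmanr(pos, values)[0]       # monotonicity of the recovered number line

# Power-law fit of (shifted) projected position against numeral value on
# log-log axes; the slope approximates the scaling exponent beta.
p = pos - pos.min() + 1.0
beta = np.polyfit(np.log(values), np.log(p), 1)[0]
print(f"Spearman rho = {rho:.3f}, scaling exponent beta ~= {beta:.2f}")
```

With a genuinely log-compressed direction, as in this synthetic example, the fit yields ρ close to 1 and β well below 1; a linear number line would instead give β close to 1.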
Primary Area: interpretability and explainable AI
Submission Number: 7044