Neural Manifold Geometry Encodes Feature Fields

Published: 23 Sept 2025, Last Modified: 29 Oct 2025 · NeurReps 2025 Proceedings · CC BY 4.0
Keywords: Neural representation, Interpretability, Linear probe, World models
TL;DR: Feature fields generalize neural representations from atomic units to intrinsically topological structures.
Abstract: Neural networks represent concepts, or "features", but the general nature of these representations remains poorly understood. Previous approaches treat features as scalar-valued random variables. However, recent evidence for emergent world models motivates investigating when and how neural networks represent more complex structures. In this work, we formalize and study $\textit{feature fields}$—function-valued features defined over manifolds and other topological spaces corresponding to the underlying world (e.g., value functions, belief distributions). We introduce $\textit{linear field probing}$, a method that extends linear probing to extract feature fields from neural activations. Whereas a linear probe maps scalar features to individual points in activation space, a linear field probe embeds the topological space of a feature field into activation space. We prove that the geometry of this embedding fully defines the space of linearly representable functions for a given feature field. We empirically study feature fields of various topologies using linear field probing and present evidence of their emergence in transformers. This work establishes a formal connection between geometry and representation in neural networks.
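The abstract's contrast between scalar probes and field probes can be sketched in code. The following is a minimal illustrative example, not the authors' implementation: a synthetic "feature field" on the circle $S^1$ is linearly encoded in activations, and a least-squares linear field probe is fit that maps each activation to the field's values at discretized points on the circle. The probe's columns, one per manifold point, trace out an embedding of the circle in activation space. All names, dimensions, and the two-Fourier-direction construction are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 64   # activation dimension (illustrative)
n = 500  # number of samples
k = 32   # discretization of the circle S^1

thetas = np.linspace(0, 2 * np.pi, k, endpoint=False)

# Synthetic ground truth: activations linearly encode a function on S^1
# through two fixed "Fourier" directions u, v in activation space.
u, v = rng.normal(size=(2, d))
X = rng.normal(size=(n, d))  # activations, one row per sample
# F[i, j] = value of sample i's feature field at angle thetas[j]
F = (X @ u)[:, None] * np.cos(thetas) + (X @ v)[:, None] * np.sin(thetas)

# Linear field probe: a single linear map W with X @ W ≈ F.
# Column W[:, j] is the probe direction for manifold point thetas[j];
# together the columns embed the circle into activation space.
W, *_ = np.linalg.lstsq(X, F, rcond=None)

# Because the field is linearly encoded, the probe recovers it exactly
# (up to numerical error).
print(np.allclose(X @ W, F, atol=1e-6))
```

In this toy setup the recovered embedding $\theta \mapsto W[:, j]$ lies in the 2-plane spanned by `u` and `v`, matching the abstract's claim that the embedding's geometry determines which functions of the field are linearly representable: here, only the span of $\cos\theta$ and $\sin\theta$.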
Submission Number: 105