When Does Geometry Emerge from Memorization in Transformers?

13 Apr 2026 (modified: 23 Apr 2026) · Under review for TMLR · CC BY 4.0
Abstract: Transformer models trained on relational tasks often exhibit structured internal representations, which are commonly interpreted as geometric organization. Prior work documents such structure via visualization or performance-based analyses, but does not isolate whether perfect memorization alone yields geometric representations. Here, we conduct a controlled study using synthetic relational worlds defined by canonical graph topologies, explicitly training Transformers to perfectly memorize relational structure without imposing constraints that favor or discourage geometry, and examining whether geometric representations arise as a consequence. Across chains, cycles, regular graphs, and star graphs, models achieve perfect memorization accuracy while their internal embeddings fail to systematically preserve either global distances or local neighborhoods, indicating reliance on non-geometric, index-based representations. By probing embeddings against shortest-path distance using rank-consistency and neighborhood-preservation metrics, we show that memorization alone places no requirement on metric organization. Recoverable geometric structure emerges only when the task objective, together with the relational topology, sufficiently constrains node interchangeability, reducing the space of symmetry-equivalent memorization solutions. Our results show that perfect memorization does not imply emergent geometric structure, and they characterize the conditions under which such structure arises in learned embeddings.
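A minimal sketch of the kind of geometric probe the abstract describes: compare pairwise embedding distances against shortest-path distances on the graph via a rank-consistency score. The chain topology, random embeddings, Euclidean metric, and simple Spearman computation here are illustrative assumptions, not the paper's exact protocol.

```python
import itertools
import random
from collections import deque

def shortest_path_dists(adj):
    """All-pairs hop distances on an unweighted graph via BFS from each node."""
    d = {}
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        d[s] = dist
    return d

def spearman(x, y):
    """Spearman rank correlation (no tie correction -- a rough sketch)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Example: 6-node chain graph probed with random (hence non-geometric) embeddings.
n = 6
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
random.seed(0)
emb = {i: [random.gauss(0, 1) for _ in range(8)] for i in range(n)}

sp = shortest_path_dists(adj)
graph_d, emb_d = [], []
for i, j in itertools.combinations(range(n), 2):
    graph_d.append(sp[i][j])
    emb_d.append(sum((a - b) ** 2 for a, b in zip(emb[i], emb[j])) ** 0.5)

# No systematic relation is expected for random embeddings; a geometric
# representation would instead yield a rank correlation close to 1.
rho = spearman(graph_d, emb_d)
```

The same pairwise-distance lists also support a neighborhood-preservation check, e.g. asking whether each node's nearest embedding neighbors coincide with its graph neighbors.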
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Rémi_Flamary1
Submission Number: 8399