How Cross-Entropy Shapes Representation Geometry: A Spectral Study on Cycle Graphs

Published: 26 May 2026, Last Modified: 26 May 2026
Venue: ICML 2026 FoGen Workshop Poster
License: CC BY 4.0
Keywords: Embedding geometry, Softmax cross-entropy, Sparsity, Spectral analysis
Abstract: Why do embeddings trained on graph transition matrices remember geometry? On cycles, we show that this phenomenon is neither surprising nor mysterious: it admits a transparent spectral explanation in terms of the global optima and implicit bias of softmax cross-entropy. We make this mechanism explicit by characterizing the optimal embedding geometry of tied and untied parameterizations trained on cycle graphs, under both sparse targets and dense label-smoothed targets. For sparse targets, tied embeddings converge to a finite rank-2 solution whose embeddings recover the cycle structure, whereas untied embeddings diverge along a max-margin direction whose rank scales with the number of nodes and whose spectrum differs from that of the target. With label smoothing, untied embeddings recover the target structure exactly up to scaling, while tied embeddings approximate its positive semidefinite component. Together, our theoretical and numerical results show that geometric memory on cycles is a consequence of how cross-entropy transforms the spectral structure of the target distribution into representation geometry through its implicit optimization bias.
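The spectral picture behind the rank-2 claim can be illustrated with a small sketch. It does not reproduce the paper's training setup or its optimality proofs. It only checks a standard fact about cycles: the random-walk transition matrix of an n-cycle has the first Fourier mode (cosine and sine of the node angle) as eigenvectors with eigenvalue cos(2π/n), so a two-dimensional embedding built from that mode places the nodes on a circle in cycle order. The function names, the choice n = 12, and the use of the simple random walk as the target are assumptions made for illustration.

```python
import math

def cycle_transition_matrix(n):
    """Row-stochastic random-walk matrix of the n-cycle (assumed target):
    each node moves to its left or right neighbor with probability 1/2."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][(i - 1) % n] = 0.5
        P[i][(i + 1) % n] = 0.5
    return P

def fourier_embedding(n):
    """Rank-2 embedding from the first nontrivial Fourier mode:
    node i is mapped to (cos(2*pi*i/n), sin(2*pi*i/n))."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]

def matvec(P, v):
    """Plain matrix-vector product, to keep the sketch dependency-free."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

n = 12  # illustrative cycle length
P = cycle_transition_matrix(n)
E = fourier_embedding(n)
lam = math.cos(2 * math.pi / n)  # eigenvalue of the first Fourier mode

# Both embedding coordinates are eigenvectors of P with eigenvalue lam.
for coord in range(2):
    v = [e[coord] for e in E]
    Pv = matvec(P, v)
    assert all(abs(a - lam * b) < 1e-12 for a, b in zip(Pv, v))

def gram(i, j):
    """Inner product of two node embeddings; equals cos(2*pi*(i-j)/n)."""
    return E[i][0] * E[j][0] + E[i][1] * E[j][1]

# Among all other nodes, the two cycle neighbors of node 0 have the
# largest inner product with it, so the embedding preserves adjacency.
closest = max(range(1, n), key=lambda j: gram(0, j))
assert closest in (1, n - 1)
```

Because the Gram matrix of this embedding depends only on the cyclic offset between nodes, it is circulant like the target itself, which is the sense in which a rank-2 geometry can mirror the cycle's spectral structure.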
Submission Number: 198