Decoupling Dynamical Richness from Representation Learning: Towards Practical Measurement

ICLR 2026 Conference Submission 13862 Authors

18 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: training dynamics, representation learning, lazy/rich regime, neural collapse, grokking, kernel methods
TL;DR: Practical method to quantify dynamical richness independent of performance.
Abstract: Dynamic feature transformation (the rich regime) does not always align with predictive performance (better representations), yet accuracy is often used as a proxy for richness, limiting analysis of their relationship. We propose a computationally efficient, performance-independent metric of richness grounded in the low-rank bias of rich dynamics, which recovers neural collapse as a special case. The metric is empirically more stable than existing alternatives and captures known lazy-to-rich transitions (e.g., grokking) without relying on accuracy. We further use it to examine how training factors (e.g., learning rate) relate to richness, confirming recognized assumptions and highlighting new observations (e.g., that batch normalization promotes rich dynamics). We also introduce an eigendecomposition-based visualization to support interpretability; together, these provide a diagnostic tool for studying the relationship between training factors, dynamics, and representations.
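To make the low-rank-bias idea concrete, the following is a minimal, hypothetical Python sketch of one possible richness proxy: the effective rank (exponential of the entropy of the normalized singular-value spectrum) of a layer's feature matrix. The function name `effective_rank` and the choice of this particular spectrum summary are illustrative assumptions for intuition only, not the paper's actual metric.

```python
# Hypothetical sketch (not the submission's metric): a low-rank-bias richness
# proxy computed from a feature matrix, e.g., penultimate-layer activations.
import numpy as np

def effective_rank(features: np.ndarray, eps: float = 1e-12) -> float:
    """Exponential of the entropy of the normalized singular-value spectrum.

    Lower values indicate stronger low-rank structure, which rich training
    dynamics are expected to induce; the value is independent of accuracy.
    """
    # Center features, then take singular values only.
    s = np.linalg.svd(features - features.mean(axis=0), compute_uv=False)
    p = s / (s.sum() + eps)                      # normalize the spectrum
    entropy = -(p * np.log(p + eps)).sum()       # Shannon entropy of spectrum
    return float(np.exp(entropy))

# Toy comparison: a random (lazy-like) feature matrix versus a nearly rank-1 one.
rng = np.random.default_rng(0)
lazy_feats = rng.normal(size=(512, 64))                    # high effective rank
rich_feats = rng.normal(size=(512, 1)) @ rng.normal(size=(1, 64))
rich_feats += 0.01 * rng.normal(size=rich_feats.shape)     # near rank-1: low effective rank
print(effective_rank(lazy_feats), effective_rank(rich_feats))
```

The singular-value spectrum computed here is also the kind of quantity an eigendecomposition-based visualization could plot over training to track lazy-to-rich transitions, though the submission's visualization may differ.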
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 13862