On the Dynamics & Transferability of Latent Generalization during Memorization

16 Sept 2025 (modified: 20 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Memorization, Generalization, Deep Neural Network
Abstract: Deep Networks are known to have extraordinary generalization abilities, via mechanisms that are not yet well understood. It is also known that, upon shuffling labels in the training data to varying degrees, Deep Networks trained with standard methods can still achieve perfect or near-perfect accuracy on this corrupted training data. This phenomenon is called memorization, and it typically comes at the cost of poorer generalization to true labels. Recent work has demonstrated, surprisingly, that such networks retain significantly better latent generalization than their outputs suggest, which can be recovered via simple probes on their layer-wise representations. However, the origin and training dynamics of this latent generalization are not well understood. Here, we empirically track the training dynamics and find that latent generalization largely peaks early in training, alongside model generalization, suggesting a common origin for both. However, while model generalization degrades steeply thereafter, latent generalization falls more modestly and plateaus at a higher level over epochs of training. Next, we design a new linear probe, in contrast with the quadratic probe used in prior work, and demonstrate that it achieves superior generalization performance in most cases. Importantly, using the linear probe, we devise a way to transfer the latent generalization present in last-layer representations to the model by directly modifying the model weights. This immediately endows such models with improved generalization, i.e., without additional training. Finally, we use the linear probe to design initializations for Deep Networks, which, in many cases, turn out to be memorization-resistant without any regularization. That is, Deep Networks with such initializations tend to evade memorization of corrupted labels under standard training methods alone, which is often accompanied by better generalization. Our findings provide a more detailed account of the rich dynamics of latent generalization during memorization, and demonstrate the means to leverage this understanding to directly transfer this generalization to the model and to design better model-weight initializations in the memorization regime.
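To make the probe-to-model transfer concrete, below is a minimal sketch of what it could look like, not the paper's actual method. It assumes a PyTorch model split into a `features` backbone and a `classifier` head, and access to a set of examples for fitting the probe; the helper names `extract_features`, `fit_linear_probe`, and `transfer_probe_to_model` are hypothetical and introduced here for illustration only.

```python
# Hypothetical sketch: fit a linear probe on penultimate-layer features,
# then copy its weights into the model's classifier head so the model
# inherits the probe's generalization without additional training.
import torch
import torch.nn as nn

@torch.no_grad()
def extract_features(model, loader, device="cpu"):
    """Collect penultimate-layer features and labels.
    Assumes the model exposes a `features` backbone (an assumption)."""
    feats, labels = [], []
    model.eval()
    for x, y in loader:
        feats.append(model.features(x.to(device)).flatten(1).cpu())
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)

def fit_linear_probe(feats, labels, num_classes, epochs=50, lr=1e-2):
    """Train a simple linear probe on the frozen features
    with full-batch cross-entropy."""
    probe = nn.Linear(feats.shape[1], num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(probe(feats), labels)
        loss.backward()
        opt.step()
    return probe

@torch.no_grad()
def transfer_probe_to_model(model, probe):
    """Overwrite the classifier head's weights with the probe's,
    directly modifying model weights as the abstract describes."""
    model.classifier.weight.copy_(probe.weight)
    model.classifier.bias.copy_(probe.bias)
```

Because the probe is linear over the same features the classifier head consumes, its weights are shape-compatible with the head and can be swapped in place; a quadratic probe, by contrast, would have no such direct correspondence, which may be one practical motivation for the linear design.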
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 7113