Keywords: two-layer neural network, feature learning, metric learning, local elasticity, retrieval, random matrix theory
TL;DR: We extend the Conjugate Kernel framework to identify conditions on unseen data distributions that induce local elasticity and clustering, providing a unified theory for feature learning and metric learning in the proportional regime.
Abstract: Investigating phenomena such as Alignment and Local Elasticity is essential for understanding the feature space of neural networks and for enhancing performance across a wide range of tasks.
In this context, we investigate the emergence of these phenomena in two-layer neural networks performing a classification task.
This paper reveals that the conditions under which Alignment and Local Elasticity emerge after one step of training are identical.
In particular, we demonstrate that intra-class features are more aligned when the inner product of their mean and the covariance of the training data-label, i.e., the train-unseen similarity, is large, and that stronger Local Elasticity occurs under the same condition.
We validate our theory through experiments with a two-layer network showing that both Alignment and Local Elasticity improve as the train-unseen similarity increases.
Furthermore, we claim that our analysis provides both theoretical and practical insights into how train-unseen similarity relates to alignment and to improved clustering performance on unseen data for neural networks trained on data from a similar domain. This claim is supported by experiments, including a multi-layer CNN setup, and by detailed discussion.
Specifically, we show that higher train-unseen similarity improves Recall@1 in two-layer networks and that Alignment and Recall@1 exhibit a positive correlation in metric learning.
We also present novel techniques for deriving operator norm bounds of non-centered Sub-Gaussian matrices, extending conventional regression analysis with standard Gaussian assumptions to the binary classification setting.
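For readers unfamiliar with the retrieval metric mentioned above, the following is a minimal sketch of how Recall@1 is commonly computed on a set of feature vectors: each sample queries all other samples, and the query counts as a hit if its nearest neighbor shares its label. The function name `recall_at_1`, the use of cosine similarity, and the toy data are assumptions made for illustration; the abstract does not specify the similarity measure or evaluation protocol used in the paper.

```python
import numpy as np


def recall_at_1(features, labels):
    """Fraction of samples whose nearest neighbor (excluding itself) has the same label.

    Cosine similarity is an assumption of this sketch, not taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # L2-normalize rows so that the inner product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normalized = features / np.clip(norms, 1e-12, None)

    # Pairwise similarities; mask the diagonal so a sample cannot retrieve itself.
    sim = normalized @ normalized.T
    np.fill_diagonal(sim, -np.inf)

    # Index of the most similar other sample for each query.
    nearest = np.argmax(sim, axis=1)
    return float(np.mean(labels[nearest] == labels))


# Toy example (hypothetical data): two well-separated clusters in 2D.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
score = recall_at_1(toy_features, toy_labels)
```

In this toy case every point's nearest neighbor lies in its own cluster, so well-clustered features yield a high score, which is the sense in which Recall@1 serves as a proxy for clustering quality on unseen data.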
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4780