Keywords: Spectral Dynamics, Contrastive Learning, Spurious Correlation
TL;DR: We introduce a regularizer that reshapes the spectral dynamics of contrastive learning to favor diverse, task-relevant features over spurious ones.
Abstract: Contrastive learning methods are widely used to learn general-purpose representations from unlabeled data. However, they often exhibit a bias toward simple, easily learnable features—many of which may be spuriously correlated with downstream labels. This bias can limit performance, particularly for underrepresented or complex concepts. In this work, we study how such spurious correlations influence the spectral dynamics of the learned feature representations—that is, how the eigenspectrum of the feature covariance matrix evolves during training.
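For concreteness, here is a minimal sketch of how the eigenspectrum of the feature covariance matrix can be measured on a batch of representations. The function name and PyTorch usage are our own illustrative choices, not taken from the paper:

```python
import torch

def feature_spectrum(features: torch.Tensor) -> torch.Tensor:
    """Eigenvalues of the batch feature covariance, sorted in descending order.

    features: (N, d) matrix of representations from one batch.
    """
    centered = features - features.mean(dim=0, keepdim=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    # The covariance matrix is symmetric PSD; eigvalsh returns
    # eigenvalues in ascending order, so flip for descending.
    return torch.linalg.eigvalsh(cov).flip(0)
```

Tracking this spectrum across training steps gives the "spectral dynamics" the abstract refers to, e.g. whether a few leading eigenvalues grow to dominate the rest.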
We provide empirical and theoretical evidence that spurious features tend to dominate early spectral modes, leading to collapsed or low-rank representations that restrict downstream flexibility. To mitigate this effect, we propose a simple spectral regularization strategy that promotes high-rank representations by flattening the feature spectrum. Our method integrates seamlessly with SimCLR and improves robustness across a range of spurious correlation benchmarks. These findings highlight the importance of spectral diversity for effective self-supervised learning and suggest new directions for improving contrastive objectives.
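The abstract does not specify the regularizer's exact form. As a hedged sketch, one standard way to flatten the spectrum is to penalize the entropy gap of the normalized covariance eigenvalues; the function name, the entropy-based penalty, and the weight `lam` below are illustrative assumptions, not necessarily the paper's method:

```python
import torch

def spectral_flatness_penalty(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Penalty that is zero iff the covariance eigenspectrum is perfectly flat.

    Encourages high-rank representations by maximizing the entropy of the
    normalized eigenvalues (an illustrative assumption, not the paper's exact loss).
    """
    centered = features - features.mean(dim=0, keepdim=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    eigvals = torch.linalg.eigvalsh(cov).clamp_min(eps)
    p = eigvals / eigvals.sum()                    # normalized spectrum
    entropy = -(p * p.log()).sum()                 # maximal when all eigenvalues are equal
    max_entropy = torch.log(torch.tensor(float(p.numel())))
    return max_entropy - entropy

# Hypothetical usage alongside a SimCLR objective:
#   loss = simclr_loss(z_i, z_j) + lam * spectral_flatness_penalty(torch.cat([z_i, z_j]))
```

Because the penalty is added to the contrastive loss, it trades off instance discrimination against spectral diversity, with `lam` controlling how strongly the spectrum is flattened.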
Student Paper: Yes
Submission Number: 105