Understanding and addressing spurious correlation via Neural Tangent Kernels: A spectral bias perspective
Spurious correlations can lead neural networks to rely heavily on features that correlate strongly with the target labels only in the training set; such correlations may not persist in real-world scenarios. As a result, performance degrades on certain subgroups of the data. In this work, we leverage theoretical insights from the Neural Tangent Kernel (NTK) to investigate the group robustness problem in the presence of spurious correlations. Specifically, we find that poor generalization is not solely a consequence of statistical biases inherent in the dataset; it also arises from the disparity in complexity between spurious and core features. Building on this observation, we propose a method that adjusts the spectral properties of neural networks to mitigate bias without requiring knowledge of the spurious attributes.
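
To make the notion of spectral bias concrete, the following is a minimal sketch of the standard linearized-training picture, not the method proposed in this work. It assumes a hypothetical two-layer tanh network, squared loss, and a kernel that stays fixed during training; the names `empirical_ntk` and `remaining_residual_fraction`, the network sizes, and the learning rate are illustrative choices. Under these assumptions, the training residual along the i-th NTK eigenvector shrinks by a factor of (1 - lr * lambda_i) per gradient step, so directions with large eigenvalues are fitted first.

```python
import numpy as np

def empirical_ntk(X, W, a):
    """Empirical NTK Gram matrix of f(x) = a . tanh(W x) / sqrt(m),
    taken with respect to all parameters (W, a). Illustrative model only."""
    m = W.shape[0]
    act = np.tanh(X @ W.T)                    # (n, m) hidden activations
    dact = 1.0 - act ** 2                     # (n, m) derivative of tanh
    J_a = act / np.sqrt(m)                    # (n, m) Jacobian w.r.t. a
    # Jacobian w.r.t. W: a_j * tanh'(w_j . x_i) * x_i / sqrt(m)
    J_W = (a * dact)[:, :, None] * X[:, None, :] / np.sqrt(m)   # (n, m, d)
    J = np.concatenate([J_a, J_W.reshape(len(X), -1)], axis=1)
    return J @ J.T                            # (n, n) kernel Gram matrix

def remaining_residual_fraction(K, lr, steps):
    """Fraction of the initial residual left along each NTK eigenvector
    after `steps` of gradient descent on squared loss with a fixed kernel."""
    eigvals = np.linalg.eigvalsh(K)           # ascending order
    return eigvals, np.abs(1.0 - lr * eigvals) ** steps

# Assumed toy setup: random inputs and random initial parameters.
rng = np.random.default_rng(0)
n, d, m = 20, 5, 256
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))
a = rng.normal(size=m)

K = empirical_ntk(X, W, a)
lr = 0.5 / np.linalg.eigvalsh(K)[-1]          # step size below 1 / lambda_max
eigvals, frac = remaining_residual_fraction(K, lr=lr, steps=50)
```

Here `frac[-1]` (the top eigen-direction) is far smaller than `frac[0]` (the bottom one), which is the sense in which features aligned with large-eigenvalue directions are learned earlier. Whether spurious features occupy those directions depends on the data and architecture and is not established by this sketch.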