Keywords: Spurious Correlation, Machine Learning, Deep Learning, Robust Learning
Abstract: Existing research often posits spurious features as "easier" to learn than core features in neural network optimization, but the nuanced impact of their relative simplicity remains under-explored. In this paper, we propose a theoretical framework and an associated synthetic dataset grounded in Boolean function analysis. Our framework allows fine-grained control over both the relative complexity (compared to core features) and the correlation strength (with respect to the label) of spurious features. Experimentally, we observe that the presence of _stronger_ spurious correlations or _simpler_ spurious features slows the rate at which the core features are learned when the network is trained with (stochastic) gradient descent. Perhaps surprisingly, we also observe that spurious features are not forgotten even after the network has _perfectly_ learned the core features. We give theoretical justifications for these observations in the special case of learning with parity features on a one-hidden-layer network. Our findings explain the success of last-layer retraining in accelerating core feature convergence and identify limitations of debiasing algorithms that exploit the early learning of spurious features. We corroborate our findings through experiments on real-world vision datasets, thereby validating the practical relevance of our framework.
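To make the described setup concrete, here is a minimal, hypothetical sketch of the kind of synthetic Boolean dataset the abstract refers to: the label is a higher-degree parity over a block of "core" coordinates, while a disjoint, lower-degree "spurious" parity is made to agree with the label on a controllable fraction of samples. The function name, parameter choices, and construction details are illustrative assumptions, not the paper's actual data-generation code.

```python
import numpy as np

def make_boolean_dataset(n, d=20, core_deg=5, spur_deg=1, corr=0.95, seed=0):
    """Illustrative sketch (assumed construction, not the paper's code).

    Samples x in {-1, +1}^d where the label y is the parity of `core_deg`
    core coordinates, and a disjoint block of `spur_deg` coordinates is set
    so that its (simpler) parity agrees with y on a `corr` fraction of the
    samples -- `corr` plays the role of the spurious correlation strength.
    """
    rng = np.random.default_rng(seed)
    x = rng.choice([-1.0, 1.0], size=(n, d))
    core_idx = np.arange(core_deg)                        # core-feature coordinates
    spur_idx = np.arange(core_deg, core_deg + spur_deg)   # spurious-feature coordinates

    y = np.prod(x[:, core_idx], axis=1)                   # core (higher-degree) parity label

    # Decide per-sample whether the spurious parity should agree with the label.
    agree = rng.random(n) < corr
    target = np.where(agree, y, -y)
    spur_parity = np.prod(x[:, spur_idx], axis=1)

    # Flip one spurious coordinate wherever the spurious parity misses its target.
    flip = spur_parity != target
    x[flip, spur_idx[0]] *= -1.0
    return x, y

# Example: a degree-1 spurious feature that is 95% correlated with the label.
X, Y = make_boolean_dataset(n=10_000, core_deg=5, spur_deg=1, corr=0.95)
```

Varying `spur_deg` controls the relative complexity of the spurious feature, and varying `corr` controls its correlation strength, matching the two axes of control the abstract describes.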
Submission Number: 76