Keywords: debias, spurious correlation, mixup
Abstract: Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are correlated with undesirable features. Techniques have been proposed to prevent a network from learning such features, using the heuristic that spurious correlations are ``simple'' and learned preferentially during training by SGD. Recent methods resample or augment training data such that examples displaying spurious correlations (a.k.a. bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. These approaches are difficult to train and scale to real-world data, e.g., because they rely on disentangled representations. We propose an alternative based on mixup that augments the bias-conflicting training data with convex combinations of existing examples and their labels. Our method, named SelecMix, applies mixup to selected pairs of examples, which show either (i)~the same label but dissimilar biased features, or (ii)~a different label but similar biased features. To compare examples with respect to the biased features, we use an auxiliary model relying on the heuristic that biased features are learned preferentially during training by SGD. On semi-synthetic benchmarks where this heuristic is valid, we obtain results superior to existing methods, in particular in the presence of label noise that makes the identification of bias-conflicting examples challenging.