When Covariate-shifted Data Augmentation Increases Test Error And How to Fix It

Anonymous

Sep 25, 2019 ICLR 2020 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: Empirically, data augmentation sometimes improves and sometimes hurts test error, even when only adding points with labels from the true conditional distribution that the hypothesis class is expressive enough to fit. In this paper, we provide precise conditions under which data augmentation hurts test accuracy for minimum norm estimators in linear regression. To mitigate the failure modes of augmentation, we introduce X-regularization, which uses unlabeled data to regularize the parameters towards the non-augmented estimate. We prove that our new estimator never hurts test error and exhibits significant improvements over adversarial data augmentation on CIFAR-10.
  • Keywords: data augmentation, adversarial training, interpolation, overparameterized
0 Replies

Loading