Understanding the Detrimental Class-level Effects of Data Augmentation

ICML 2023 Workshop SCIS Submission82 Authors

Published: 20 Jun 2023, Last Modified: 28 Jul 2023SCIS 2023 PosterEveryoneRevisions
Keywords: data augmentation, class-dependent bias
Abstract: Data augmentation (DA) encodes invariance and provides implicit regularization critical to a model's performance in image classification tasks. However, while DA improves average accuracy, recent studies have shown that its impact can be highly class dependent: achieving optimal average accuracy comes at the cost of significantly hurting individual class accuracy by as much as $20\%$ on ImageNet. In this work, we present a framework for understanding how DA interacts with class-level learning dynamics. Using higher-quality multi-label annotations on ImageNet, we systematically categorize the affected classes and find that the majority are inherently ambiguous, spuriously correlated, or involve fine-grained distinctions, while DA controls the model's bias towards one of the closely related classes. While many of the previously reported performance drops are explained by multi-label annotations, our analysis of class confusions reveals other sources of accuracy degradation. We show that simple class-conditional augmentation strategies informed by our framework improve performance on the negatively affected classes.
Submission Number: 82
Loading