Keywords: Data Augmentations, Out-of-Domain Generalization, Stochasticity, Flatness, Neural Networks, Invariance
TL;DR: We uncover mechanisms by which data augmentations regularize training, examining the relationship between augmented views and extra data, as well as the roles of invariance, stochasticity, and flatness, including under distribution shifts.
Abstract: Despite the clear performance benefits of data augmentations, little is known about why they are so effective. In this paper, we disentangle several key mechanisms through which data augmentations operate. Establishing an exchange rate between augmented and additional real data, we find that augmentations can provide nearly the same performance gains as additional data samples for in-domain generalization, and even greater gains on out-of-distribution test sets. We also find that neural networks with hard-coded invariances underperform those whose invariances are learned via data augmentations. Our experiments suggest that these generalization benefits arise from the additional stochasticity conferred by randomized augmentations, which leads to flatter minima.