Data Augmentation: A Fourier Analysis Perspective

Published: 23 Sept 2025, Last Modified: 29 Oct 2025
Venue: NeurReps 2025 Poster
Readers: Everyone
License: CC BY 4.0
Keywords: data augmentation, Fourier analysis, invariance, equivariance, symmetry
TL;DR: We prove that, perhaps surprisingly, partial data augmentation using only a small subset of the group (rather than the entire group) suffices to achieve the full statistical benefits of data augmentation for learning under invariances.
Abstract: Data augmentation, which supplements a dataset with transformed copies of each datum according to a known symmetry group, provides a model-agnostic approach to enforcing invariances, in contrast to methods that encode symmetries directly into the model. Although data augmentation has proven effective in theory and practice, augmenting with every element of the group is often computationally infeasible, which raises the question: can sublinear augmentation, using a number of transformations that grows sublinearly in the group size, match full augmentation in terms of generalization bounds and sample complexity? In this paper, we develop a theoretical framework based on Fourier analysis and show that sublinear data augmentation attains the full statistical benefits of full data augmentation. To our knowledge, this is the first proof of the efficacy of sublinear augmentation, highlighting an underexplored reason why augmentation remains a powerful and widely applicable strategy even when performed only partially.
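As a rough illustration of the distinction between full and partial augmentation, the following minimal sketch may help. It assumes a toy setting that is not taken from the paper: the symmetry group is the cyclic group of shifts acting on length-n sequences, and the partial scheme uses the identity plus a uniformly random subset of shifts. The function names, the subset size `m`, and the random choice of subset are all hypothetical; the paper's actual construction of the subset may differ.

```python
import random


def cyclic_shift(x, k):
    """Rotate the sequence x left by k positions (action of k in the cyclic group Z_n)."""
    k %= len(x)
    return x[k:] + x[:k]


def augment(dataset, shifts):
    """Return one transformed copy of every (x, y) pair for each shift in `shifts`."""
    return [(cyclic_shift(x, k), y) for x, y in dataset for k in shifts]


def full_augmentation(dataset, n):
    """Full augmentation: apply all n group elements to every datum."""
    return augment(dataset, range(n))


def partial_augmentation(dataset, n, m, seed=0):
    """Partial augmentation (illustrative assumption): identity plus m - 1 random shifts."""
    rng = random.Random(seed)
    shifts = [0] + rng.sample(range(1, n), m - 1)
    return augment(dataset, shifts)


# Toy dataset: one labelled sequence of length 8, so the group has 8 elements.
dataset = [([1, 2, 3, 4, 5, 6, 7, 8], 0)]
full = full_augmentation(dataset, n=8)
partial = partial_augmentation(dataset, n=8, m=3)
```

In this sketch the fully augmented set grows linearly with the group size (8 copies per datum), while the partial set contains only `m` copies per datum; the abstract's claim concerns when such a smaller set can match the statistical benefits of the full one.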
Submission Number: 45