Track: long paper (up to 10 pages)
Domain: machine learning
Abstract: Data augmentation is widely recognized for improving generalization in deep networks, yet its impact on the geometry of learned representations remains poorly understood. In this work, we characterize how different data augmentation strategies reshape the internal representations of neural networks. Using tools from shape analysis, we embed a network's hidden representations into a metric space whose distance is invariant to scaling, translation, rotation, and reflection. We show that increasing augmentation strength traces reproducible trajectories in this space, and that different augmentation types steer representations in distinct directions. Moreover, we investigate how neural representation shapes are distorted along data augmentation trajectories, and show that insights from neural geometry can predict which representations provide the greatest improvement when ensembling models. Our results reveal shared geometric patterns across architectures and seeds, and suggest that analyzing shape-space trajectories offers a principled tool for understanding and comparing data augmentation methods.
Presenter: ~Tianxiao_He1
Submission Number: 98
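
The abstract describes embedding hidden representations into a metric space whose distance is invariant to scaling, translation, rotation, and reflection. The paper's own construction is not shown here, but a standard way to obtain such an invariant distance is full Procrustes alignment. Below is a minimal, hypothetical sketch of that idea; the function name, interface, and data shapes are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: a Procrustes-style shape distance between two hidden-representation
# matrices, invariant to translation, scaling, rotation, and reflection.
# This is NOT the paper's code; names and shapes are hypothetical.
import numpy as np


def shape_distance(X: np.ndarray, Y: np.ndarray) -> float:
    """Shape distance between two (n_samples, n_features) representation matrices."""
    # Translation invariance: center each representation.
    Xc = X - X.mean(axis=0, keepdims=True)
    Yc = Y - Y.mean(axis=0, keepdims=True)
    # Scale invariance: normalize each to unit Frobenius norm.
    Xc = Xc / np.linalg.norm(Xc)
    Yc = Yc / np.linalg.norm(Yc)
    # Rotation/reflection invariance: solve the orthogonal Procrustes problem
    # (optimal orthogonal map from Yc onto Xc) via SVD.
    U, _, Vt = np.linalg.svd(Yc.T @ Xc)
    R = U @ Vt
    # Residual misfit after optimal alignment.
    return float(np.linalg.norm(Xc - Yc @ R))


# Illustrative usage: compare activations of two models on the same inputs.
rng = np.random.default_rng(0)
acts_model_a = rng.normal(size=(512, 64))  # hypothetical hidden activations
acts_model_b = rng.normal(size=(512, 64))
print(shape_distance(acts_model_a, acts_model_b))
```

Because both matrices are centered, unit-normalized, and optimally aligned over the full orthogonal group (rotations and reflections), the resulting distance depends only on the shape of the point clouds, which is the kind of invariance the abstract attributes to its shape-space embedding.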