Geometric Stability of Representation Manifolds as a Training-Free Diagnostic for Studying Data Augmentations
Keywords: self-supervised learning, data augmentation, representation geometry, Procrustes analysis, feature manifold, medical imaging, training-free diagnostic
TL;DR: We propose a training-free geometric diagnostic to evaluate the effect of augmentations on representation manifolds.
Abstract: Data augmentation is the primary mechanism for defining representation invariances in self-supervised learning (SSL), but augmentation selection remains largely empirical and computationally costly, because validating a choice typically requires repeated full training runs. We introduce a training-free diagnostic that evaluates augmentations by the geometric stability of the learned embedding manifold. Our method uses Procrustes analysis to measure the non-rigid distortions that augmentation operators induce in the feature space of a strong pre-trained encoder. We observe a statistically significant relationship between geometric preservation and the semantic consistency of representations in high-dimensional space. These findings establish global geometric stability as a computationally efficient, training-free diagnostic for studying the semantic effects of data augmentations. We also investigate boundary conditions by analyzing cases in which geometric proximity decouples from instance-level discriminability. Our framework provides a principled, mathematically grounded approach for evaluating augmentations in medical and general-purpose foundation models.
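As a rough illustration of the kind of measurement the abstract describes, the sketch below computes a standard Procrustes disparity between two embedding matrices: one for the original inputs and one for their augmented counterparts. It is a minimal sketch under stated assumptions, not the submission's implementation: the function name `procrustes_disparity` is hypothetical, the embeddings are random stand-ins rather than outputs of a pre-trained encoder, and the alignment allows translation, uniform scaling, and orthogonal transforms (including reflections). A value near 0 means the augmented embeddings are a rigid, similarity-transformed copy of the originals; larger values indicate non-rigid distortion.

```python
import numpy as np


def procrustes_disparity(X, Y):
    """Residual after optimally aligning Y to X (hypothetical helper).

    X, Y: arrays of shape (n_samples, dim); row i of Y is assumed to be
    the embedding of the augmented version of the input behind row i of X.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape != Y.shape:
        raise ValueError("X and Y must have the same shape")

    # Remove translation by centering each point cloud.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)

    # Remove scale by normalizing to unit Frobenius norm.
    nx, ny = np.linalg.norm(X0), np.linalg.norm(Y0)
    if nx == 0 or ny == 0:
        raise ValueError("each input must contain at least two distinct points")
    X0 /= nx
    Y0 /= ny

    # With optimal orthogonal map and scale, the squared residual is
    # 1 - (sum of singular values of X0^T Y0)^2.
    singular_values = np.linalg.svd(X0.T @ Y0, compute_uv=False)
    return float(1.0 - singular_values.sum() ** 2)


# Stand-in data: random "embeddings" instead of real encoder features.
rng = np.random.default_rng(0)
emb_original = rng.normal(size=(50, 8))

# A purely rigid change (rotation/reflection, scaling, shift).
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
emb_rigid = 2.5 * emb_original @ Q + 3.0

# The same rigid change plus per-point noise, i.e. a non-rigid distortion.
emb_distorted = emb_rigid + 0.5 * rng.normal(size=emb_rigid.shape)

d_rigid = procrustes_disparity(emb_original, emb_rigid)
d_distorted = procrustes_disparity(emb_original, emb_distorted)
```

In this toy setup `d_rigid` is zero up to floating-point error, while `d_distorted` is strictly larger, which is the contrast a stability score of this kind is meant to expose.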
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Style Files: I have used the style files.
Submission Number: 109