Informed Augmentation Selection Improves Tabular Contrastive Learning

Published: 2025, Last Modified: 12 Sept 2025. Venue: PAKDD (1) 2025. License: CC BY-SA 4.0
Abstract: The effectiveness of contrastive learning heavily depends on data augmentations, yet the suitability of tabular augmentation techniques for contrastive learning remains unclear. In this study, we assess the compatibility of various tabular augmentation techniques with contrastive learning by examining their impact on feature space characteristics, particularly uniformity and alignment, which serve as proxies for downstream performance. Our investigation, employing six prevalent tabular augmentation techniques, reveals the substantial influence of augmentations on feature space quality. We find that achieving a balance between uniformity and alignment is essential for good downstream performance. Based on these insights, we propose a novel framework for selecting augmentation combinations that strike this balance. Experimental results on 21 tabular datasets from the OpenML-CC18 benchmark and 5 TCGA cancer genomics datasets consistently demonstrate the effectiveness of our proposed framework in enhancing downstream performance.
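The abstract does not say how uniformity and alignment are computed. As a minimal sketch, the code below assumes the formulation commonly used in the contrastive learning literature: alignment is the mean distance between L2-normalized embeddings of positive pairs, raised to a power alpha, and uniformity is the log of the mean Gaussian potential over all pairs of embeddings. The function names, the defaults alpha=2 and t=2, and the use of NumPy are assumptions made for illustration and are not taken from the paper. Lower values are better for both quantities.

```python
import numpy as np


def l2_normalize(z, eps=1e-12):
    """Project each row of z onto the unit hypersphere."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / np.maximum(norms, eps)


def alignment(z_a, z_b, alpha=2.0):
    """Mean distance**alpha between embeddings of two augmented views.

    Row i of z_a and row i of z_b are assumed to come from the same sample.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    return float(np.mean(np.linalg.norm(z_a - z_b, axis=1) ** alpha))


def uniformity(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of rows."""
    z = l2_normalize(z)
    # Pairwise squared Euclidean distances, shape (n, n).
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    # Keep each unordered pair once and exclude the diagonal.
    upper = np.triu_indices(len(z), k=1)
    return float(np.log(np.mean(np.exp(-t * sq_dists[upper]))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical embeddings of two augmented views of 128 samples.
    view_a = rng.normal(size=(128, 16))
    view_b = view_a + 0.1 * rng.normal(size=(128, 16))
    print("alignment:", alignment(view_a, view_b))
    print("uniformity:", uniformity(np.vstack([view_a, view_b])))
```

Under these assumed definitions, an augmentation that barely perturbs the input gives alignment near zero but does little to spread the embeddings, while a very aggressive one can improve uniformity at the cost of alignment. This is the trade-off the abstract describes when it says the two must be balanced.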