Regularising for invariance to data augmentation improves supervised learning

TMLR Paper435 Authors

14 Sept 2022 (modified: 28 Feb 2023) · Rejected by TMLR
Abstract: Data augmentation is used in machine learning to make the classifier invariant to label-preserving transformations. Usually this invariance is only encouraged implicitly, by sampling a single augmentation per image and training epoch. However, several works have recently shown that using multiple augmentations per input can improve generalisation or can be used to incorporate invariances more explicitly. In this work, we first empirically compare these recently proposed objectives, which differ in whether they rely on explicit or implicit regularisation and at what level of the predictor they encode the invariances. We show that the predictions of the best performing method are also the most similar when compared on different augmentations of the same input. Inspired by this observation, we propose an explicit regulariser that encourages this invariance at the level of individual model predictions. Through extensive experiments on CIFAR-100 and ImageNet we show that this explicit regulariser (i) improves generalisation and (ii) equalises performance differences between all considered objectives. Our results suggest that objectives that encourage invariance at the level of the neural network features themselves generalise better than those that only achieve invariance by averaging predictions of non-invariant models.
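The abstract describes an explicit regulariser that penalises disagreement between a model's predictions on different augmentations of the same input, and the changelog mentions a symmetrised KL divergence. A minimal sketch of such a penalty, assuming a symmetrised-KL form over softmax predictions of two augmented views (the exact loss used in the paper may differ):

```python
import numpy as np


def softmax(logits, axis=-1):
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def kl(p, q, eps=1e-12):
    # KL(p || q) per example; eps guards against log(0).
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)


def invariance_penalty(logits_a, logits_b):
    """Symmetrised KL between predictions on two augmentations
    of the same batch of inputs (hypothetical form of the regulariser)."""
    p, q = softmax(logits_a), softmax(logits_b)
    return float(np.mean(0.5 * (kl(p, q) + kl(q, p))))


# Identical predictions incur zero penalty; divergent ones are penalised.
rng = np.random.default_rng(0)
logits_a = rng.normal(size=(4, 10))
logits_b = rng.normal(size=(4, 10))
print(invariance_penalty(logits_a, logits_a))  # 0.0
print(invariance_penalty(logits_a, logits_b) > 0.0)  # True
```

In training, this penalty would be added to the usual cross-entropy loss with some weighting coefficient; the weighting and the number of augmentations per input are choices the paper studies, not fixed by this sketch.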
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Incorporating minor corrections requested by reviewers:
* Adding a reference to Wang et al. (2022).
* Adding comparisons of different invariance measures in Appendix D.1 and Figure 8.
* Adding additional results for a WideResNet with batch norm in Appendix D.4 and Figure 10.
* Updating manuscript phrasing and notation regarding the usage of the symmetrised KL divergence.
* Fixing incomplete fields in references.
Assigned Action Editor: ~Jakub_Mikolaj_Tomczak1
Submission Number: 435