Theoretical Guarantees of Data Augmented Last Layer Retraining Methods

Published: 01 Jan 2024, Last Modified: 13 May 2025, Venue: ISIT 2024, License: CC BY-SA 4.0
Abstract: Ensuring fair predictions across many distinct subpopulations in the training data can be prohibitive for large models. Recently, simple linear last layer retraining strategies, in combination with data augmentation methods such as upweighting, downsampling, and mixup, have been shown to achieve state-of-the-art performance for worst-group accuracy, which quantifies accuracy for the least prevalent subpopulation. For linear last layer retraining and the aforementioned augmentations, we present the optimal worst-group accuracy when modeling the distribution of the latent representations (the input to the last layer) as Gaussian for each subpopulation. We evaluate and verify our results on both synthetic and large publicly available datasets.
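To make the setting concrete, the following is a minimal illustrative sketch, not the paper's derivation or experiments. It samples Gaussian "latent" features for four subpopulations, fits a linear last layer, and compares worst-group accuracy with no augmentation, with upweighting, and with downsampling. Everything in it is an assumption made for illustration: the group means, the group sizes, the helper names (`make_groups`, `fit_linear`, `worst_group_accuracy`, `downsample_indices`), and the use of a weighted least-squares classifier, which is chosen for brevity and may differ from the loss analyzed in the paper. Mixup is omitted.

```python
import numpy as np

def make_groups(rng, n_major, n_minor):
    """Sample 2-D Gaussian latents for 4 groups indexed by (label, domain).

    Hypothetical setup: the label shifts the mean along axis 0 (core feature),
    the domain shifts it along axis 1 (spurious feature). Groups where
    label == domain are the majority groups.
    """
    X, y, g = [], [], []
    for label in (0, 1):
        for dom in (0, 1):
            n = n_major if label == dom else n_minor
            mean = np.array([2.0 * label - 1.0, 2.0 * dom - 1.0])
            X.append(rng.normal(mean, 1.0, size=(n, 2)))
            y.append(np.full(n, label))
            g.append(np.full(n, 2 * label + dom))
    return np.vstack(X), np.concatenate(y), np.concatenate(g)

def fit_linear(X, y, sample_weight):
    """Weighted least-squares fit of a linear classifier with a bias term."""
    A = np.hstack([X, np.ones((len(X), 1))])
    t = 2.0 * y - 1.0                      # targets in {-1, +1}
    s = np.sqrt(sample_weight)
    theta, *_ = np.linalg.lstsq(A * s[:, None], t * s, rcond=None)
    return theta

def worst_group_accuracy(theta, X, y, g):
    """Minimum over groups of the per-group accuracy."""
    A = np.hstack([X, np.ones((len(X), 1))])
    pred = (A @ theta > 0).astype(int)
    return min(np.mean(pred[g == k] == y[g == k]) for k in np.unique(g))

def downsample_indices(rng, g):
    """Keep as many samples per group as the smallest group has."""
    n_min = np.bincount(g).min()
    return np.concatenate([
        rng.choice(np.flatnonzero(g == k), n_min, replace=False)
        for k in np.unique(g)
    ])

rng = np.random.default_rng(0)
X_tr, y_tr, g_tr = make_groups(rng, n_major=2000, n_minor=100)  # imbalanced
X_te, y_te, g_te = make_groups(rng, n_major=2000, n_minor=2000)  # balanced

# No augmentation: every sample has weight 1.
theta_erm = fit_linear(X_tr, y_tr, np.ones(len(y_tr)))

# Upweighting: weight each sample by the inverse of its group's size.
theta_up = fit_linear(X_tr, y_tr, 1.0 / np.bincount(g_tr)[g_tr])

# Downsampling: train on a group-balanced subset with unit weights.
idx = downsample_indices(rng, g_tr)
theta_down = fit_linear(X_tr[idx], y_tr[idx], np.ones(len(idx)))

results = {
    "erm": worst_group_accuracy(theta_erm, X_te, y_te, g_te),
    "upweight": worst_group_accuracy(theta_up, X_te, y_te, g_te),
    "downsample": worst_group_accuracy(theta_down, X_te, y_te, g_te),
}
print(results)
```

In this toy configuration the unaugmented classifier leans on the spurious axis, so its accuracy on the minority groups drops, while both augmentations rebalance the groups before the last layer is fit. The specific numbers depend entirely on the assumed means and group sizes.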