For Robust Worst-Group Accuracy, Ignore Group Annotations

TMLR Paper2901 Authors

20 Jun 2024 (modified: 04 Nov 2024) | Decision pending for TMLR | CC BY 4.0
Abstract: Existing methods for last-layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise. The WGA gap is exacerbated in high-noise regimes for models trained with vanilla empirical risk minimization. To address this, we introduce Regularized Annotation of Domains (RAD) to train robust last-layer classifiers without needing explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yu_Yao3
Submission Number: 2901