Track: regular paper (up to 6 pages)
Keywords: Group Robustness, Spurious Correlation, Group Annotation, Last layer fine-tuning
TL;DR: Mitigate spurious correlations without group labels by fine-tuning the last layer of an ERM-trained model on high-gradient-norm samples from a held-out set.
Abstract: This work addresses the limitations of deep neural networks (DNNs) in generalizing beyond training data due to spurious correlations. Recent research has demonstrated that models trained with empirical risk minimization (ERM) learn both core and spurious features, often upweighting spurious ones in the final classification layer, which can lead to poor performance on minority groups. Deep Feature Reweighting (DFR) alleviates this issue by retraining the model's last classification layer on a group-balanced held-out validation set. However, relying on spurious feature labels during training or validation limits practical applicability, as spurious features are not always known and can be costly to annotate. Our preliminary experiments reveal that ERM-trained models exhibit higher gradient norms on minority-group samples in the held-out set. Leveraging this insight, we propose GradTune, which fine-tunes the last classification layer using high-gradient-norm samples. Our results on four well-established benchmarks demonstrate that the proposed method achieves competitive performance compared to existing methods without requiring group labels during either training or validation.
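The abstract describes the core recipe: score held-out samples by the gradient norm of the loss with respect to the last-layer weights, keep the highest-scoring samples (which tend to come from minority groups), and fine-tune only the last classification layer on them. Below is a minimal sketch of such a procedure, not the authors' implementation; the model attributes (`feature_extractor`, `fc`), the top-fraction hyperparameter, and the optimizer settings are all assumptions for illustration.

```python
# Minimal GradTune-style sketch (hypothetical, not the paper's code).
# Assumes a PyTorch model with `model.feature_extractor` (penultimate
# features) and `model.fc` (linear classification head) -- both assumed names.
import torch
import torch.nn.functional as F


def last_layer_grad_norm(model, x, y):
    """Per-sample gradient norm of the loss w.r.t. the last-layer weights.

    For a linear head with cross-entropy loss, the per-sample gradient of
    the weight matrix is (softmax(z) - onehot(y)) f^T, so its Frobenius
    norm factorises as ||softmax(z) - onehot(y)|| * ||f||.
    """
    model.eval()
    with torch.no_grad():
        feats = model.feature_extractor(x)            # assumed attribute
        probs = F.softmax(model.fc(feats), dim=1)
        probs[torch.arange(len(y)), y] -= 1.0         # softmax(z) - onehot(y)
        return probs.norm(dim=1) * feats.norm(dim=1)


def gradtune(model, held_out_loader, top_frac=0.2, epochs=10, lr=1e-3):
    # 1) Score every held-out sample by its last-layer gradient norm.
    scores, xs, ys = [], [], []
    for x, y in held_out_loader:
        scores.append(last_layer_grad_norm(model, x, y))
        xs.append(x)
        ys.append(y)
    scores, xs, ys = torch.cat(scores), torch.cat(xs), torch.cat(ys)

    # 2) Keep the highest-gradient-norm samples (presumed minority-group rich).
    k = max(1, int(top_frac * len(scores)))
    idx = scores.topk(k).indices

    # 3) Fine-tune only the last classification layer on the selected samples,
    #    with the feature extractor kept frozen.
    with torch.no_grad():
        feats = model.feature_extractor(xs[idx])
    opt = torch.optim.SGD(model.fc.parameters(), lr=lr)
    for _ in range(epochs):
        loss = F.cross_entropy(model.fc(feats), ys[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

The selection fraction `top_frac` plays the role of a validation-tuned hyperparameter; the sketch simply illustrates the score-select-retrain loop implied by the abstract.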
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Presenter: ~Patrik_Joslin_Kenfack1
Submission Number: 57