Abstract: Over-parameterization in machine learning often leads models to rely heavily on ‘spurious’ features, which lack causal relationships with the true labels. This reliance can significantly impair a model’s performance, especially for minority subgroups. Alleviating this issue is particularly challenging in the absence of subgroup labels. To improve model generalizability without subgroup annotations, we propose LogitMixup Feature Reweighting (LFR), a novel two-stage method that enhances model robustness. First, we train an auxiliary model deliberately tuned to amplify spurious correlations. We then divide the dataset into two pseudo-groups based on the output logits of the auxiliary model: one group aligns with the bias, while the other conflicts with it. We apply mixup augmentation to pairs drawn from these two groups within the same class, forming a reweighting dataset. In the second stage, we freeze the feature extractor and retrain only the decision layer of a model originally trained via empirical risk minimization. LFR enhances model robustness without requiring additional supervision, such as annotations of spurious attributes. Furthermore, LFR retrains only the decision layer, for just a few epochs, and requires no supervision at model-selection time. Our experiments on benchmark datasets demonstrate that LFR improves group robustness: it not only outperforms existing methods that do not use group labels but also competes closely with ‘oracle’ methods that utilize subgroup annotations.
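The pseudo-grouping and within-class mixup steps described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper’s implementation: the function names, the simple “auxiliary model correct vs. incorrect” split rule, and the Beta-distributed mixing coefficient are assumptions for illustration only.

```python
import random

def split_pseudo_groups(aux_logits, labels):
    """Pseudo-group split (assumed rule): a sample is 'bias-aligned' when
    the bias-amplified auxiliary model predicts its label correctly, and
    'bias-conflicting' otherwise. aux_logits: per-sample class scores."""
    aligned = []
    for scores, y in zip(aux_logits, labels):
        pred = max(range(len(scores)), key=scores.__getitem__)
        aligned.append(pred == y)
    return aligned

def build_mixup_set(features, labels, aligned, alpha=0.2, seed=0):
    """Pair a bias-aligned sample with a bias-conflicting sample of the
    same class and mix them with lambda ~ Beta(alpha, alpha), forming
    the reweighting dataset used in stage two."""
    rng = random.Random(seed)
    mixed = []
    for c in sorted(set(labels)):
        a = [i for i, (y, m) in enumerate(zip(labels, aligned)) if y == c and m]
        b = [i for i, (y, m) in enumerate(zip(labels, aligned)) if y == c and not m]
        for i, j in zip(a, b):
            lam = rng.betavariate(alpha, alpha)
            x = [lam * fa + (1 - lam) * fb
                 for fa, fb in zip(features[i], features[j])]
            mixed.append((x, c))  # label is shared by both parents
    return mixed
```

In stage two, the decision layer alone would be retrained on such a mixed set while the feature extractor stays frozen; that retraining loop is omitted here.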
External IDs: doi:10.1007/978-981-97-8702-9_22