On the Unreasonable Effectiveness of Last-layer Retraining

Published: 15 May 2026 · Last Modified: 15 May 2026 · Accepted by TMLR · License: CC BY 4.0
Abstract: Last-layer retraining (LLR) methods --- wherein the last layer of a neural network is reinitialized and retrained on a held-out set following ERM training --- have garnered interest as an efficient approach to rectify dependence on spurious correlations and improve performance on minority groups. Surprisingly, LLR has been found to improve worst-group accuracy even when the held-out set is an imbalanced subset of the training set. We initially hypothesize that this "unreasonable effectiveness" of LLR is explained by its ability to mitigate neural collapse through the held-out set, resulting in the implicit bias of gradient descent benefiting robustness. Our empirical investigation does not support this hypothesis. Instead, we present strong evidence for an alternative hypothesis: that the success of LLR is primarily due to better group balance in the held-out set. We conclude by showing how the recent algorithms CB-LLR and AFR perform implicit group-balancing to elicit a robustness improvement.
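The group-balance hypothesis in the abstract can be illustrated with a small sketch. The snippet below is not the paper's implementation; it shows, under assumed data structures (a list of per-example group labels), one simple way to construct a group-balanced held-out subset of the kind the abstract credits for LLR's success. The function name `group_balanced_subset` and the toy data are hypothetical.

```python
import random
from collections import defaultdict

def group_balanced_subset(groups, seed=0):
    """Return indices of a subset in which every group appears equally often.

    `groups` gives each held-out example's group label (a hypothetical
    representation; the paper's actual pipeline may differ). Each group is
    subsampled down to the size of the smallest group.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    n = min(len(idxs) for idxs in by_group.values())  # smallest group's size
    chosen = []
    for idxs in by_group.values():
        chosen.extend(rng.sample(idxs, n))  # take n examples from each group
    return sorted(chosen)

# Toy imbalanced held-out set: group "a" is the majority, "b" the minority.
groups = ["a"] * 8 + ["b"] * 2
idx = group_balanced_subset(groups)
print(len(idx))  # 4: two examples from each group
```

Retraining only the last layer on such a balanced subset (with the feature extractor frozen) is the basic LLR recipe the abstract describes; the abstract's claim is that this balancing, rather than effects on neural collapse, drives the worst-group accuracy gains.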
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We have updated our submission according to Action Editor feedback. Minor revisions include:
1. Related Works: minor writing changes and the addition of a new LLR-related reference that utilizes a sample-splitting procedure.
2. Appendix A4: new results for experiments run with a new model architecture (Swin Vision Transformer).
3. Various writing changes and clarifications throughout.
Assigned Action Editor: ~Hongyang_R._Zhang1
Submission Number: 6681