Alignment, Convexity and Completeness: Mechanisms Behind GroupDRO

16 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: robustness, fairness, deep learning
TL;DR: We find that GDRO works both through a strong classifier effect and by inducing lower completeness in learned representations.
Abstract: Models trained with Empirical Risk Minimization (ERM) often fail to generalize under spurious correlations. Group Robustness Methods (GRMs), notably Group DRO (GDRO), mitigate this by reweighting losses across groups defined by labels and spurious attributes, yet why they work remains only partially understood. We study the learning dynamics of GRMs and their effects on both the classifier head and the representation. Theoretically, in a head-only fine-tuning setting (fixed features), we analyze the classifier learned by GDRO and show: (i) GDRO aligns less with a spurious classifier and more with an oracle non-spurious classifier than ERM does; (ii) when group losses are $\mu$-strongly convex, the alignment gap controls performance, yielding an upper bound on the worst-group performance gap between ERM and GDRO; and (iii) for convex losses, adding L2 regularization induces $\mu$-strong convexity, so the same guarantees apply, which offers an explanation for the empirical gains of GDRO with L2 regularization reported in prior work. Empirically, across standard image and text benchmarks, we confirm the predicted alignment behavior. Beyond the head, under end-to-end training GDRO also reshapes the representation: using a measure called Completeness, we show that GDRO spreads task-relevant information across multiple dimensions, whereas ERM tends to concentrate it in fewer dimensions, making ERM more prone to relying on spurious attributes for prediction. Together, our theory and measurements clarify the mechanisms by which GDRO outperforms ERM.
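For orientation, the two objectives contrasted in the abstract, and the L2 step in claim (iii), can be sketched in standard notation. This is an illustrative sketch, not the paper's own formulation: the symbols $\theta$ (classifier parameters), $\ell$ (per-example loss), $\mathcal{G}$ (the set of groups), $P_g$ (the data distribution of group $g$), $\mathcal{L}_g$ (the expected loss on group $g$), and $\lambda$ (the regularization strength) are assumptions introduced here.

$$
\text{ERM:}\quad \min_{\theta}\; \mathbb{E}_{(x,y)\sim P}\big[\ell(\theta; x, y)\big]
\qquad\qquad
\text{GDRO:}\quad \min_{\theta}\; \max_{g \in \mathcal{G}}\; \mathcal{L}_g(\theta),
\quad \mathcal{L}_g(\theta) = \mathbb{E}_{(x,y)\sim P_g}\big[\ell(\theta; x, y)\big]
$$

$$
\mathcal{L}_g \text{ convex} \;\Longrightarrow\; \mathcal{L}_g(\theta) + \tfrac{\lambda}{2}\,\lVert \theta \rVert_2^2 \ \text{ is } \lambda\text{-strongly convex}
$$

Under this convention for the regularizer, the strong-convexity constant in claim (ii) would be $\mu = \lambda$; a different scaling of the L2 term changes the constant accordingly.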
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 7999