To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models

Published: 01 Jan 2024 · Last Modified: 19 Feb 2025 · AISTATS 2024 · CC BY-SA 4.0
Abstract: In fair machine learning, one source of performance disparities between groups is overfitting to groups with relatively few training samples. We derive group-specific bounds on the generalization error of welfare-centric fair machine learning that benefit from the larger sample size of the majority group. We do this by considering group-specific Rademacher averages over a restricted hypothesis class, which contains the family of models likely to perform well with respect to a fair learning objective (e.g., a power-mean). Our simulations demonstrate that these bounds improve over a naïve method, as expected by theory, with particularly significant improvement for smaller group sizes.
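For context, the power-mean objective referenced in the abstract is a standard aggregator of per-group quantities; the following is a minimal sketch of its usual form, where the weights $w_k$ and exponent $p$ are illustrative assumptions rather than the paper's exact choices:

\[
  M_p(u_1, \ldots, u_g) \;=\; \Big( \sum_{k=1}^{g} w_k\, u_k^{\,p} \Big)^{1/p},
  \qquad w_k \ge 0,\ \ \sum_{k=1}^{g} w_k = 1,
\]

where $u_k$ denotes a per-group quantity (e.g., utility or risk) for group $k$. When applied to per-group utilities, taking $p \to -\infty$ recovers the minimum, i.e., an egalitarian objective that focuses entirely on the worst-off group, while $p = 1$ recovers the weighted average.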