Keywords: excess risk, SGD, overparameterization, fairness
TL;DR: We prove a group excess risk bound in the setting of overparameterized linear regression, and explore its implications for trustworthy machine learning.
Abstract: It has been observed that machine learning models trained using stochastic gradient descent (SGD) generalize poorly to certain groups both within and outside the population from which training instances are sampled. This has serious ramifications for the fairness, privacy, robustness, and out-of-distribution (OOD) generalization of machine learning. Hence, we theoretically characterize the inherent generalization of overparameterized linear regression learned via SGD to intra- and extra-population groups. We do this by proving an excess risk bound for an arbitrary group in terms of the full eigenspectra of the data covariance matrices of the group and population. We additionally provide a novel interpretation of the bound in terms of how the group and population data distributions differ and the group effective dimension of SGD, and connect these factors to real-world challenges in the practice of trustworthy machine learning. We further study our bound empirically on simulated data.