When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction

Published: 24 Apr 2023, Last Modified: 21 Jun 2023
Venue: ICML 2023 (Oral)
Abstract: Machine learning models are often personalized with categorical attributes that define groups. In this work, we show that personalization with *group attributes* can inadvertently reduce performance at a *group level* -- i.e., groups may receive unnecessarily inaccurate predictions by sharing their personal characteristics. We present formal conditions to ensure the *fair use* of group attributes in a prediction task, and describe how they can be checked by training one additional model. We characterize how fair use conditions can be violated due to standard practices in model development, and study the prevalence of fair use violations in clinical prediction tasks. Our results show that personalization often fails to produce a tailored performance gain for every group that reports personal data, and underscore the need to evaluate fair use when personalizing models with characteristics that are protected, sensitive, self-reported, or costly to acquire.
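The abstract notes that fair use conditions can be checked by training one additional model. The snippet below is a minimal sketch of that idea, not the paper's exact procedure: it trains a personalized model with the group attribute and a generic model without it on synthetic data, then compares per-group accuracy, flagging a group whose accuracy drops under personalization. The synthetic data, variable names, and the accuracy-gain criterion are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative setup: a binary group attribute plus five other features,
# with a mildly group-dependent signal in the labels.
rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, size=n)
x = rng.normal(size=(n, 5))
logit = x[:, 0] + 0.5 * group * x[:, 1]
y = (logit + rng.normal(scale=1.0, size=n) > 0).astype(int)

X_personalized = np.column_stack([x, group])  # includes the group attribute
X_generic = x                                 # omits the group attribute

idx_train, idx_test = train_test_split(np.arange(n), test_size=0.5, random_state=0)

# The generic model is the "one additional model" used for the check.
personalized = LogisticRegression().fit(X_personalized[idx_train], y[idx_train])
generic = LogisticRegression().fit(X_generic[idx_train], y[idx_train])

for g in (0, 1):
    mask = idx_test[group[idx_test] == g]
    acc_p = accuracy_score(y[mask], personalized.predict(X_personalized[mask]))
    acc_g = accuracy_score(y[mask], generic.predict(X_generic[mask]))
    gain = acc_p - acc_g
    # A negative gain means this group would be better served by withholding
    # its attribute -- the kind of group-level harm the paper describes.
    status = "ok" if gain >= 0 else "potential fair use violation"
    print(f"group {g}: personalized={acc_p:.3f} generic={acc_g:.3f} "
          f"gain={gain:+.3f} ({status})")
```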
Submission Number: 3692