Characterizing Sociodemographic Error Disparities in Large-scale Language-based Health Predictions

ACL ARR 2025 May Submission5381 Authors

20 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: Most NLP bias studies focus on individual- or document-level tasks, yet fields where bias has substantial consequences, like public health, operate at the community level. We systematically examine sociodemographic error disparities in NLP models predicting $\textit{community-level}$ health outcomes across billions of community-mapped messages and evaluate four strategies for including sociodemographic factors. We introduce the $\textit{Bilateral Concentration Index}$ (BCI) to quantify non-monotonic disparities missed by traditional metrics, finding that all baseline language-alone models had moderate disparities (average BCI = 6.6\%). However, while incorporating sociodemographics into modeling consistently improved $\textit{accuracy}$, it often increased $\textit{disparities}$, from negligibly (concatenation: BCI = 6.6\%) to significantly (adaptation: BCI = 8.2\%), suggesting a cost-benefit trade-off. The largest error disparities emerged over education and income (BCI = 2.7–16.4\%), reducing accuracy for low-income (and sometimes high-income) communities, which could disadvantage them if predictions are used for policy decisions. These findings suggest the need to evaluate error disparities alongside accuracy to ensure fairness as models enter real-world applications.
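To make the disparity measurement concrete, below is a minimal sketch of the standard (Wagstaff-style) concentration index over communities ranked by a socioeconomic variable, which is the family of metrics the BCI builds on; the bilateral variant itself and its exact definition are given only in the paper, so the function name, inputs, and scaling here are illustrative assumptions rather than the authors' method.

```python
import numpy as np

def concentration_index(errors, ses):
    """Standard (Wagstaff-style) concentration index of model errors
    over a socioeconomic ranking (NOT the paper's Bilateral Concentration
    Index; this is the conventional metric it extends).

    errors : per-community absolute prediction errors
    ses    : per-community socioeconomic values (e.g., median income),
             used only to rank communities

    Returns a value in [-1, 1]: 0 means errors are spread evenly across
    the SES ranking; negative values mean errors concentrate in
    lower-SES communities, positive in higher-SES ones.
    """
    errors = np.asarray(errors, dtype=float)
    order = np.argsort(ses)                        # rank communities by SES
    ranked = errors[order]
    n = len(ranked)
    frac_rank = (np.arange(1, n + 1) - 0.5) / n    # fractional ranks in (0, 1)
    mu = ranked.mean()
    # CI = 2 * cov(error, fractional rank) / mean error
    return 2.0 * np.cov(ranked, frac_rank, bias=True)[0, 1] / mu

# Example: errors concentrated in low-income communities -> negative CI
errs = np.array([4.0, 3.5, 2.0, 1.0, 0.5])
income = np.array([20_000, 30_000, 50_000, 70_000, 90_000])
print(round(concentration_index(errs, income), 3))  # about -0.345
```

A monotonic index like this can score near zero when errors pile up at both extremes of the ranking; the abstract's point is that a bilateral formulation is needed to capture such non-monotonic patterns.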
Paper Type: Short
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, model bias/unfairness mitigation, ethical considerations in NLP applications
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 5381