Track: Main Papers Track (6 to 9 pages)
Keywords: Captioning Models, Fairness, Racial Bias, Race and Ethnicity, Generative AI
TL;DR: Analyzing captioning models on three datasets (TSE, UTKFace, FairFace), we find persistent racial attribution disparities: non-white groups are over-racialized, exposing societal biases embedded in multimodal AI models.
Abstract: Multimodal AI models that reproduce undesirable social biases remain a critical challenge, and racial diversity and fairness are key concerns in the AI literature. This paper investigates racial bias in captioning models, focusing on how different racial groups are represented in the generated text. We use a new dataset, TSE (a self-hetero-identified dataset from Brazil), together with two existing ones (UTKFace and FairFace), to assess racial attribution disparities in state-of-the-art models. Our findings reveal that certain racial groups are disproportionately referenced: white is often treated as the default, while other races are mentioned explicitly at varying rates. This discrepancy reflects societal biases embedded in widely used models, perpetuating racial stereotypes and reinforcing systemic inequities. We also show that other racial domains, such as the Brazilian one, are poorly captured by existing models, leading to disparities in racial representation. Our study underscores the urgent need for bias mitigation strategies in generative image-text-to-text models to ensure fairer and more inclusive representation across diverse racial identities.
Submission Number: 15