Measuring and Mitigating Racial Bias in Embedding Models: A Comparative Study for Law Enforcement Retrieval
Keywords: embedding bias, racial bias, law enforcement AI, semantic retrieval, fairness in NLP, algorithmic fairness, bias measurement, model selection, deployment evaluation, high-stakes NLP
TL;DR: We present the first comparative audit of racial bias in embedding models used for law enforcement retrieval, showing that bias magnitude varies substantially across models and that informed model selection offers a cost-effective mitigation.
Abstract: Embedding models are often used for semantic retrieval in high-stakes domains such as law enforcement, where biased outputs can have severe consequences. We systematically measure racial bias in six widely used embedding models by computing similarity scores between crime incident texts that include racial identity tokens and simple law enforcement queries. The analysis reveals that racial descriptors consistently shift cosine similarity scores and retrieval rankings for semantically identical crime incidents. All models exhibit statistically significant bias, though its magnitude varies across models. The study contributes a reproducible methodology and metrics for measuring bias in embedding-based systems and for guiding model selection when deploying NLP-based retrieval in the law enforcement domain, allowing organizations to reduce bias at low cost through informed model choice.
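For illustration, the core measurement described in the abstract can be sketched as follows. This is a minimal, hypothetical example, not the paper's actual evaluation code: it assumes the sentence-transformers library, uses "all-MiniLM-L6-v2" as a stand-in for any of the six audited models, and uses an invented query and incident template rather than the study's data.

```python
# Minimal sketch of the identity-swap similarity measurement described in the abstract.
# Assumptions: the sentence-transformers library is installed; "all-MiniLM-L6-v2" is a
# stand-in model; the query, template, and identity tokens are illustrative only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

query = "suspect fled the scene after a robbery"
template = "A {} man was reported fleeing the scene after a robbery."
identity_tokens = ["white", "Black", "Hispanic", "Asian"]

# Encode the query once, and each identity-swapped variant of the same incident.
query_emb = model.encode(query, convert_to_tensor=True)
variant_texts = [template.format(tok) for tok in identity_tokens]
variant_embs = model.encode(variant_texts, convert_to_tensor=True)

# Cosine similarity between the query and each variant. For an unbiased model,
# swapping only the racial descriptor should leave these scores (and therefore
# retrieval rankings) essentially unchanged.
scores = util.cos_sim(query_emb, variant_embs)[0]
for tok, score in zip(identity_tokens, scores):
    print(f"{tok:10s} cos_sim = {score.item():.4f}")

# One simple per-incident summary (not necessarily the paper's metric):
# the max-min spread of similarity across identity variants.
print("score spread:", (scores.max() - scores.min()).item())
```

Repeating this over many incident texts and queries, and aggregating the per-incident spreads or ranking shifts per model, is one way such a comparative audit could be operationalized under these assumptions.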
Submission Type: Discovery
Copyright Form: pdf
Submission Number: 515