Detecting Biased Language in Icelandic: A Named Entity Recognition Approach for Socially Responsible Text Analysis
Keywords: Named Entity Recognition, Bias Detection, Toxicity Detection, Lexicon
TL;DR: A publicly available framework with models, data, and a web app detects biased language in Icelandic.
Abstract: Research on bias has been limited for Icelandic, a low-resource language with few NLP tools for socially aware text analysis. We address this gap by developing a publicly accessible web application that detects biased and stigmatizing vocabulary in Icelandic text and provides category-specific feedback to encourage reflection and more inclusive communication. The application is powered by the best-performing of three Named Entity Recognition (NER) models, which we trained on automatically annotated data derived from a manually compiled lexicon of over 2,000 biased terms and phrases across 14 social categories, ranging from misogyny and queerphobia to religious and ethnic bias. All components, including the lexicon, the annotated dataset, the three fine-tuned models, and the web application code, are freely available online, offering a transparent and reproducible framework for bias detection in low-resource languages.
Track: ML track
Submission Number: 8
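
Illustrative sketch: The abstract describes NER training data that is annotated automatically from a lexicon. One common way to do this is longest-match lookup that emits BIO tags, sketched below. This is not the authors' pipeline: the function name `bio_annotate`, the placeholder lexicon entries, and the label names are all hypothetical, and the sketch assumes exact, case-insensitive matching on whitespace tokens. Because Icelandic is highly inflected, exact surface matching would miss inflected forms; the abstract does not say how the authors handle this.

```python
from typing import Dict, List, Tuple


def bio_annotate(tokens: List[str], lexicon: Dict[Tuple[str, ...], str]) -> List[str]:
    """Tag tokens with BIO labels via longest-match lookup in a phrase lexicon.

    Assumption: lexicon keys are lowercased token tuples, values are category names.
    """
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    max_len = max((len(phrase) for phrase in lexicon), default=0)

    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate first so multi-word entries take priority.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            category = lexicon.get(tuple(lowered[i:i + n]))
            if category is not None:
                labels[i] = f"B-{category}"
                for j in range(i + 1, i + n):
                    labels[j] = f"I-{category}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return labels


# Hypothetical placeholder entries, not taken from the authors' lexicon.
LEXICON = {
    ("placeholder_term",): "MISOGYNY",
    ("placeholder", "phrase"): "QUEERPHOBIA",
}

tokens = "This placeholder phrase and Placeholder_Term appear here".split()
labels = bio_annotate(tokens, LEXICON)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Token and label pairs in this form can be fed to a standard token-classification fine-tuning setup; the category suffix on each tag is what would allow category-specific feedback of the kind the abstract mentions.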