SHADES: Towards a Multilingual Assessment of Stereotypes in Large Language Models

ACL ARR 2024 June Submission2700 Authors

15 Jun 2024 (modified: 22 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large language models (LLMs) have shown impressive performance on a range of task benchmarks, including open-ended generation in English, spurring a race to increase the number of languages covered by models. However, LLMs also reproduce and exacerbate a range of social biases present in their training data. While research has attempted to identify and mitigate such biases, most efforts have concentrated on English. In this paper, we introduce a new multilingual dataset for examining culturally-specific stereotypes that may be learned by LLMs, along with templates that enable further generation of bias evaluation data. The dataset includes stereotypes from 20 geopolitical regions and 15 languages. We demonstrate its utility in a series of evaluations of both "base" and "Instruct" language models. Initial results suggest vast differences in the representation of stereotypes across both models and languages.
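The submission does not include its evaluation code here, but a common way to probe whether a model "prefers" a stereotype is to compare the log-likelihood it assigns to a stereotype sentence versus a minimally contrasting one. The sketch below illustrates that generic approach only; the model name and sentence pair are placeholders, not items from the SHADES dataset.

```python
# Minimal sketch (not the paper's released code): score a stereotype sentence
# against a minimally contrasting one by summed token log-likelihood under a
# causal LM. Model name and sentences below are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM on the Hub works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sentence_log_likelihood(sentence: str) -> float:
    """Approximate summed log-probability the model assigns to the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids returns the mean cross-entropy over the
        # n_tokens - 1 predicted positions (the first token is not predicted).
        out = model(**inputs, labels=inputs["input_ids"])
    n_predicted = inputs["input_ids"].shape[1] - 1
    return -out.loss.item() * n_predicted  # mean NLL -> summed log-prob

stereotype = "Girls are bad at math."  # illustrative example only
contrast = "Boys are bad at math."     # minimally different contrast
print(sentence_log_likelihood(stereotype) - sentence_log_likelihood(contrast))
# A positive difference means the model assigns higher probability to the
# stereotype sentence than to the contrast sentence.
```

For multilingual evaluation in this style, the same comparison would be repeated per language with translated or locale-specific sentence pairs and a model that covers that language.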
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: corpus creation, benchmarking, language resources, multilingual corpora, lexicon creation, automatic creation and evaluation of language resources, NLP datasets, evaluation methodologies, evaluation, datasets for low resource languages, metrics
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: Arabic, Bengali, Chinese, Chinese (Traditional), Dutch, English, French, German, Hindi, Italian, Marathi, Polish, Brazilian Portuguese, Romanian, Russian, Spanish
Submission Number: 2700