Stereotype Bias in a Bilingual Setting: A Culturally Grounded Evaluation in Kazakhstan

ACL ARR 2025 July Submission1393 Authors

29 Jul 2025 (modified: 06 Sept 2025)ACL ARR 2025 July SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Abstract: Stereotype bias in language models has been widely examined in English, but remains largely understudied in bilingual contexts where multiple linguistic and cultural systems interact. This gap is especially important in regions where language use reflects complex historical and sociopolitical influences. In this work, we focus on Kazakhstan, a bilingual society where Kazakh, a low-resource Turkic language, and Russian, a high-resource Slavic language, are both actively used and frequently code-mixed in everyday communication. We introduce \aq, a high-quality, human-verified dataset consisting of 5,634 stereotype-bearing statements in Kazakh, Russian, and code-mixed forms, covering six culturally salient domains. We evaluate both multilingual and Kazakh-specific language models using perplexity-based scoring and pretraining simulations, and find that stereotype bias is most pronounced in code-mixed inputs. Our results highlight the limitations of existing evaluation frameworks and emphasize the need for culturally grounded, linguistically inclusive benchmarks to better assess and mitigate bias in language models.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: low-resource language, bias, kazakh, russian, code-switching
Contribution Types: Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: Kazakh,Russian
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 4
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Section 4
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Section 3
B5 Documentation Of Artifacts: Yes
B5 Elaboration: Section 3
B6 Statistics For Data: Yes
B6 Elaboration: Section 3
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 4 and Appendix
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Appendix C
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 4
C4 Parameters For Packages: N/A
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: Yes
D1 Elaboration: Appendix G
D2 Recruitment And Payment: Yes
D2 Elaboration: Section 3
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 1393
Loading