Keywords: Large Language Models, Cultural Alignment
Abstract: Arabic language models have advanced rapidly, yet preference-based cultural alignment remains underexplored despite its critical role in real-world applications. Previous work has focused primarily on instruction-tuning for factual knowledge, overlooking nuanced cultural behaviors. We introduce \textbf{ArabPref}, the first large-scale preference-based dataset covering 22 Arab nations, capturing both preferred and dispreferred behaviors across 200 fine-grained topics such as social etiquette, food and drink, religion, and travel. Our contributions include two resources in English and Modern Standard Arabic: a culturally grounded preference dataset for training and evaluation, and a multiple-choice benchmark designed to assess culturally aligned behavior across nations. All test data are rigorously validated by native speakers to ensure authenticity. To optimize cultural alignment, we experiment with supervised fine-tuning, DPO, and KTO across multilingual and Arabic-centric language models, evaluating performance on both generation tasks and our cultural multiple-choice benchmark. By releasing both a training dataset and an evaluation benchmark, we provide a foundation for advancing culturally aware Arabic language modeling and enable significantly better cultural alignment than existing resources. Data and code are available at \url{http://anonymous.for.review}
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: safety and alignment, language/cultural bias analysis
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: English, Arabic
Submission Number: 9296