Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models

Published: 01 Jan 2025 · Last Modified: 07 Oct 2025 · NAACL 2025 (Long Papers) · CC BY-SA 4.0
Abstract: Cultural harm arises in LLMs when these models fail to align with specific cultural norms, resulting in misrepresentations or violations of cultural values. This work addresses the challenge of ensuring cultural sensitivity in LLMs, especially in small-parameter models that often lack the extensive training data needed to capture global cultural nuances. We present two key contributions: (1) a cultural harm test dataset, created to assess model outputs across different cultural contexts through scenarios that expose potential cultural insensitivities, and (2) a culturally aligned preference dataset, aimed at restoring cultural sensitivity through fine-tuning based on feedback from diverse annotators. These datasets facilitate the evaluation and enhancement of LLMs, supporting their ethical and safe deployment across different cultural landscapes. Our results show that integrating culturally aligned feedback leads to a marked improvement in model behavior, significantly reducing the likelihood of generating culturally insensitive or harmful content.
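The abstract describes fine-tuning on a preference dataset of culturally aligned versus insensitive responses, but does not name the alignment objective. As one illustrative possibility (an assumption, not the authors' stated method), such pairwise preferences could be optimized with a DPO-style loss. The sketch below is a minimal, self-contained PyTorch example; the function name `dpo_loss`, the `beta` value, and the toy inputs are all hypothetical.

```python
import torch
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO-style objective over summed log-probabilities of the culturally
    preferred ("chosen") and culturally insensitive ("rejected") responses,
    computed under the policy being fine-tuned and a frozen reference model."""
    # Log-ratio of chosen vs. rejected under the policy and the reference model.
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    # Encourage the policy to prefer the chosen response more than the reference does.
    logits = beta * (pi_logratios - ref_logratios)
    return -F.logsigmoid(logits).mean()


if __name__ == "__main__":
    # Toy batch of 4 preference pairs (log-probabilities are placeholders).
    loss = dpo_loss(
        policy_chosen_logps=torch.tensor([-10.2, -8.7, -12.1, -9.4]),
        policy_rejected_logps=torch.tensor([-9.8, -9.0, -11.5, -10.1]),
        ref_chosen_logps=torch.tensor([-10.5, -8.9, -12.0, -9.6]),
        ref_rejected_logps=torch.tensor([-9.9, -8.8, -11.7, -9.9]),
    )
    print(loss.item())
```

In practice the per-response log-probabilities would come from scoring each annotated preference pair with the small-parameter model and a frozen copy of it; other preference-based methods (e.g., RLHF with a reward model) would fit the same data equally well.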