CULTURE-GEN: Revealing Global Cultural Perception in Language Models through Natural Language Prompting

Huihan Li; Liwei Jiang; Nouha Dziri; Xiang Ren; Yejin Choi

Published: 10 Jul 2024, Last Modified: 26 Aug 2024, COLM, CC BY 4.0

Research Area: Evaluation, Societal implications, LMs for everyone

Keywords: culture, bias, nationality, prompting, knowledge

TL;DR: This paper evaluates the global culture perception of large language models on 110 cultures.

Abstract: As the use of large language models (LLMs) has proliferated worldwide, it is crucial for them to have adequate knowledge and fair representation of diverse global cultures. In this work, we uncover the culture perceptions of three SOTA models for 110 countries and regions across 8 culture-related topics through culture-conditioned generations, and extract from these generations the symbols that the LLM associates with each culture. We discover that culture-conditioned generations contain linguistic “markers” that distinguish marginalized cultures from default cultures. We also discover that LLMs show an uneven degree of diversity in their culture symbols, and that cultures from different geographic regions differ in their presence in LLMs’ culture-agnostic generations. Our findings promote further research on the knowledge and fairness of global culture perception in LLMs.

Supplementary Material: zip

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html

Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html

Submission Number: 507
