MemeBridge: A Dataset for Benchmarking and Mitigating the Bidirectional Cultural Gap in Meme Interpretation
Abstract: Communicating across cultures is a complex challenge. Memes, a prevalent form of online communication, can cause misunderstandings when used without regard for the recipient's cultural context. Large language models (LLMs) can potentially help; however, there is a notable lack of meme datasets that provide context-based explanations and potential misunderstandings for training and evaluating LLMs. To address this gap, we introduce \textsc{MemeBridge}, a carefully curated meme dataset. We manually examined the accuracy of the dataset and performed quantitative evaluations. Initial probing of various LLMs developed by teams with different cultural backgrounds revealed that they have a certain level of cross-cultural understanding and can recognize cultural differences, despite some limitations in meme comprehension. Moreover, fine-tuning these LLMs on our dataset led to performance improvements, underscoring the importance of context-rich datasets in enhancing the cultural understanding of LLMs.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: human behavior analysis, NLP tools for social analysis
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 2542