Detecting Online Community Practices with Large Language Models: A Case Study of Pro-Ukrainian Publics on Twitter

ACL ARR 2024 June Submission 3896 Authors

16 Jun 2024 (modified: 02 Aug 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Communities on social media display distinct patterns of linguistic expression and behaviour, collectively referred to as practices. These practices can be traced in textual exchanges, and reflect the intentions, knowledge, values, and norms of users and communities. This paper introduces a comprehensive methodological workflow for computational identification of such practices within social media texts. By focusing on supporters of Ukraine during the Russia-Ukraine war in (1) the activist collective NAFO and (2) the Eurovision Twitter community, we present a gold-standard data set capturing their unique practices. Using this corpus, we perform practice prediction experiments with both open-source baseline models and OpenAI's large language models (LLMs). Our results demonstrate that closed-source models, especially GPT-4, achieve superior performance, particularly with prompts that incorporate salient features of practices, or utilize Chain-of-Thought prompting. This study provides a detailed error analysis and offers valuable insights into improving the precision of practice identification, thereby supporting context-sensitive moderation and advancing the understanding of online community dynamics.
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: human behavior analysis, stance detection, sociolinguistics, NLP tools for social analysis, quantitative analyses of news and/or social media
Contribution Types: Approaches to low-resource settings, Data analysis, Theory
Languages Studied: English
Submission Number: 3896