One Model to Detect Them All? Comparing LLMs, BERT and Traditional ML in Cross-Platform Conspiracy Detection
Abstract: The proliferation of Population Replacement Conspiracy Theories (PRCTs) on social media platforms poses significant challenges for content moderation systems and societal cohesion. This paper presents a comparative analysis of approaches for detecting PRCT content, with particular focus on their ability to generalize across platforms and languages. We evaluate three distinct methodologies: pure few-shot learning with Large Language Models (DeepSeek-V3 and GPT-4o), BERT-based models fine-tuned for the task, and traditional machine learning models. Analyzing 56,085 YouTube comments against a manually annotated gold standard, we find that few-shot learning performs best, reaching 94.5% accuracy with DeepSeek-V3 and 91.0% with GPT-4o, although DeepSeek-V3 generalizes less well, with larger performance drops in new contexts. The few-shot approaches significantly outperform traditional methods and generalize robustly across platforms and languages when tested on multilingual Telegram data. To support reproducibility, both gold-standard datasets and the annotation guidelines are made publicly available.
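The few-shot setup described in the abstract can be illustrated with a minimal sketch. This is an assumption on our part, not the authors' released code: it uses the OpenAI chat completions API as one of the two evaluated LLM backends, and the prompt wording, label names, and example comments are hypothetical stand-ins for the paper's actual annotation guidelines.

```python
# Minimal few-shot PRCT classification sketch (hypothetical prompt and labels).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical few-shot demonstrations: (comment text, label) pairs.
FEW_SHOT_EXAMPLES = [
    ("They are deliberately importing voters to replace us.", "PRCT"),
    ("Immigration policy should be debated on its economic merits.", "NOT_PRCT"),
]

def classify_comment(comment: str) -> str:
    """Return 'PRCT' or 'NOT_PRCT' for a single social media comment."""
    messages = [{
        "role": "system",
        "content": (
            "You label social media comments. Answer with exactly one word: "
            "PRCT if the comment endorses a Population Replacement Conspiracy "
            "Theory, otherwise NOT_PRCT."
        ),
    }]
    # Interleave the few-shot demonstrations as prior conversation turns.
    for text, label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": comment})

    response = client.chat.completions.create(
        model="gpt-4o",   # the paper also evaluates DeepSeek-V3
        temperature=0,    # deterministic labels for reproducible evaluation
        messages=messages,
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(classify_comment("Nobody is being replaced; this is a demographic shift."))
```

The same loop, run over each comment in a gold-standard set and compared against the manual labels, yields the accuracy figures reported above; only the model name would change when swapping in DeepSeek-V3.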
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: human behavior analysis, stance detection, hate-speech detection, misinformation detection and analysis, language/cultural bias analysis, NLP tools for social analysis, quantitative analyses of news and/or social media
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: English, Spanish, Portuguese
Submission Number: 2444