Propitter: A Twitter Corpus for Computational Propaganda Detection

Published: 01 Jan 2023, Last Modified: 10 Dec 2024, Venue: MICAI (2) 2023, License: CC BY-SA 4.0
Abstract: Social networks have become one of the most popular ways for people to communicate and stay informed. As a result, these platforms are widely used to spread propaganda and thereby influence the beliefs, opinions, and actions of their users. Despite the relevance of this problem, current computational approaches to propaganda detection focus mainly on news articles and are far less developed for other sources of information, such as Twitter. In this paper, we introduce Propitter, a new corpus for propaganda detection with over 385K tweets. Its construction is based on a novel methodology that refines the output of distant supervision through cross-domain filtering followed by in-domain expansion. We provide baseline results for this corpus using both traditional and transformer-based methods, and present an experiment that points to the need for methods that go beyond topics and capture propaganda styles.