Fake News Detection: It’s All in the Data!

Soveatin Kuntur, Anna Wróblewska, Maria Ganzha, Marcin Paprzycki, Shelly Sachdeva

Published: 04 Feb 2026, Last Modified: 16 Mar 2026, Applied Sciences, CC BY-SA 4.0

Abstract: This brief survey serves as a foundational resource for researchers beginning to explore fake news detection. It emphasizes the importance of dataset quality and diversity for the effectiveness of detection models, detailing key features, labeling systems, and prevalent biases, and it presents the associated challenges and limitations. By addressing ethical considerations (such as privacy and consent, societal impacts, transparency, and accountability) and best practices (annotation methodologies, real-world dynamics, reliability, and validity), we offer a thorough overview of current datasets. Our contribution also includes a GitHub repository that aggregates publicly available datasets into a single, easily accessible portal, supporting further research and development in the fight against fake news.

External IDs: doi:10.3390/app16031585