NLP-based Feature Extraction for the Detection of COVID-19 Misinformation Videos on YouTube

22 Jun 2020, 19:39 (edited 29 Jun 2020) | ACL 2020 Workshop NLP-COVID Submission | Readers: Everyone
  • Keywords: YouTube, misinformation, conspiracy, detection, user comments
  • TL;DR: We classify conspiratorial user comments and then use the percentage of them as a feature to detect misinformation videos on YouTube.
  • Abstract: We present a simple NLP methodology for detecting COVID-19 misinformation videos on YouTube by leveraging user comments. We use transfer learning with pre-trained models to build a multi-label classifier that categorizes conspiratorial content. We then use the percentage of misinformation comments on each video as a new feature for video classification. We show that including this feature in simple models yields an accuracy of up to 82.2%. Furthermore, we verify the significance of the feature by performing a Bayesian analysis. Finally, we show that adding the first hundred comments as tf-idf features raises the video classifier accuracy to as much as 89.4%.
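
The per-video feature described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and `toy_flagger` is a keyword-based stand-in (an assumption made only for illustration) for the multi-label comment classifier, reduced here to a binary decision. The choice to return 0 for a video without comments is also an assumption.

```python
from typing import Callable, Iterable


def conspiracy_comment_share(
    comments: Iterable[str],
    is_conspiratorial: Callable[[str], bool],
) -> float:
    """Return the percentage (0-100) of comments flagged as conspiratorial."""
    comments = list(comments)
    if not comments:
        # Assumption: a video with no comments gets a share of 0.
        return 0.0
    flagged = sum(1 for comment in comments if is_conspiratorial(comment))
    return 100.0 * flagged / len(comments)


def toy_flagger(comment: str) -> bool:
    """Hypothetical stand-in for a trained comment classifier."""
    keywords = ("hoax", "plandemic", "5g")
    text = comment.lower()
    return any(keyword in text for keyword in keywords)


example_comments = [
    "This whole thing is a hoax",
    "Stay safe everyone",
    "5G towers caused it",
    "Great video, thanks",
]
share = conspiracy_comment_share(example_comments, toy_flagger)
print(f"Share of flagged comments: {share:.1f}%")
```

The resulting percentage would then be passed, alongside any other video-level features, to a downstream video classifier.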