Keywords: misinformation, LLM-generated misinformation, fringe social media
TL;DR: This study shows that unmoderated LLMs on fringe social media generate convincing misinformation that bypasses existing detection methods, exposing critical gaps in misinformation mitigation.
Abstract: The rapid advancements in large language models (LLMs) have created unprecedented opportunities for content generation but also introduced significant challenges, particularly in combating misinformation. While moderated LLMs implement safeguard measures to reduce misuse, unmoderated systems hosted on fringe social networks present an emerging and underexplored threat. In this study, we investigate the dangers of unmoderated LLMs through a case study on COVID-19 misinformation generated using Gab AI, a platform characterized by minimal content moderation. Using two distinct prompting strategies, we produced persuasive misinformation posts and evaluated the effectiveness of existing detection methods. Our results show that zero-shot detection approaches consistently fail to identify misinformation, whereas few-shot detection using carefully selected exemplars and Chain-of-Thought reasoning significantly improves performance. These findings highlight the unique challenges posed by short-form LLM-generated misinformation from fringe social media platforms, a domain that has received little attention in prior research. This work represents an exploratory step toward understanding the limitations of current detection methods and the broader risks introduced by unmoderated LLM systems proliferating in such environments.
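To illustrate the kind of few-shot, Chain-of-Thought detection setup the abstract credits with improved performance, here is a minimal sketch. The exemplar posts, reasoning chains, labels, and model name are illustrative placeholders, not the study's actual prompts, data, or configuration.

```python
# Hypothetical few-shot CoT misinformation detector sketch (not the paper's actual pipeline).
from openai import OpenAI

# Hand-picked exemplars (placeholders) pairing short posts with a brief
# reasoning chain and a label, as few-shot demonstrations.
EXEMPLARS = [
    {
        "post": "Hospitals get paid extra for every COVID death, so the counts are fake.",
        "reasoning": "The post implies death counts are fabricated for profit; "
                     "reimbursement rules do not make certified death records false.",
        "label": "misinformation",
    },
    {
        "post": "The local clinic is offering booster appointments this weekend.",
        "reasoning": "A routine service announcement with no disputed factual health claim.",
        "label": "not misinformation",
    },
]

def build_messages(post: str) -> list[dict]:
    """Assemble a few-shot CoT prompt: each exemplar shows the post, the reasoning, then the label."""
    messages = [{
        "role": "system",
        "content": "You are a misinformation analyst. Reason step by step, then answer "
                   "'misinformation' or 'not misinformation' on the final line.",
    }]
    for ex in EXEMPLARS:
        messages.append({"role": "user", "content": f"Post: {ex['post']}"})
        messages.append({"role": "assistant",
                         "content": f"Reasoning: {ex['reasoning']}\nLabel: {ex['label']}"})
    messages.append({"role": "user", "content": f"Post: {post}"})
    return messages

if __name__ == "__main__":
    client = OpenAI()  # assumes OPENAI_API_KEY is set; the model below is a placeholder choice
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=build_messages("5G towers were switched on the same week the outbreak started."),
    )
    print(response.choices[0].message.content)
```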
Submission Number: 122