A Symbolic Adversarial Learning Framework for Evolving Fake News Generation and Detection

ACL ARR 2025 May Submission 634 Authors

14 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · License: CC BY 4.0
Abstract: Rapid advances in LLMs heighten fake news risks by enabling the automatic generation of increasingly sophisticated misinformation. Previous detection methods, whether fine-tuned small models or LLM-based detectors, often struggle with the dynamically evolving nature of such content. In this work, we propose a novel framework, the Symbolic Adversarial Learning Framework (SALF), which implements an adversarial training paradigm through an agent symbolic learning optimization process rather than through numerical updates. In SALF, a generation agent crafts deceptive narratives while a detection agent uses structured debates to identify logical and factual flaws; the two agents iteratively refine themselves through these adversarial interactions. Unlike traditional neural updates, we represent agents using agent symbolic learning, where the learnable weights are the agents' prompts, and back-propagation and gradient descent are simulated by operating on natural-language representations of weights, loss, and gradients. Experiments on two multilingual benchmark datasets demonstrate SALF's effectiveness: it generates sophisticated fake news that degrades state-of-the-art detection performance by up to 53.4% in Chinese and 34.2% in English on average. SALF also refines detectors, improving detection of the refined content by up to 7.7%. We hope our work inspires further exploration into more robust, adaptable fake news detection systems.
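To make the abstract's "symbolic" training loop concrete, the sketch below shows one plausible reading of a single SALF round: the generator's prompt plays the role of its weights, the detector's debate output plays the role of a loss signal, and the "gradient step" is a prompt rewrite in natural language. This is a minimal illustration, not the authors' implementation; every name here (call_llm, salf_round, the prompt templates) is hypothetical, and the symmetric detector update is only indicated in a comment.

```python
# Hypothetical sketch of one SALF adversarial round, inferred from the abstract.
# call_llm is a stand-in for any chat-completion API; it is not part of the paper.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., a chat-completion endpoint)."""
    raise NotImplementedError

def salf_round(gen_prompt: str, det_prompt: str, topic: str) -> tuple[str, str]:
    # "Forward pass": the generation agent crafts a deceptive narrative;
    # its prompt acts as the learnable weights.
    article = call_llm(f"{gen_prompt}\nTopic: {topic}")

    # The detection agent runs a structured debate to surface
    # logical and factual flaws in the article.
    critique = call_llm(
        f"{det_prompt}\nArticle:\n{article}\n"
        "Debate the article's veracity, list its flaws, then give a verdict."
    )

    # Natural-language "loss": the flaws the detector exposed.
    gen_loss = f"The detector exposed these weaknesses:\n{critique}"

    # Natural-language "gradient": a description of how the generator's
    # instructions should change to reduce that loss.
    gen_gradient = call_llm(
        "Given this feedback, describe how the writing instructions "
        f"should change to avoid these weaknesses:\n{gen_loss}"
    )

    # "Gradient descent" step: rewrite the prompt (the weights) accordingly.
    new_gen_prompt = call_llm(
        "Rewrite the following instructions, applying the requested change.\n"
        f"Instructions:\n{gen_prompt}\nChange:\n{gen_gradient}"
    )

    # The detector would be refined symmetrically from articles it
    # misjudged; that branch is omitted from this sketch.
    return new_gen_prompt, det_prompt
```

Iterating salf_round would alternate improvements between the two agents, which is how the reported effects (harder-to-detect fake news, then stronger detectors) could arise under this reading.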
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: misinformation detection, adversarial training, generative models, LLM/AI agents, prompting, neurosymbolic approaches, ethical considerations in NLP applications
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English, Chinese
Submission Number: 634