So-Fake: Benchmarking Social Media Image Forgery Detection

ICLR 2026 Conference Submission9906 Authors

17 Sept 2025 (modified: 08 Oct 2025)ICLR 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: LLMs, Deepfake detection
Abstract: Recent advances in AI-powered generative models have enabled the creation of increasingly realistic synthetic images, posing significant risks to information integrity and public trust on social media platforms. While robust detection frameworks and diverse, large-scale datasets are essential to mitigate these risks, existing academic efforts remain limited in scope: current datasets lack the diversity, scale, and realism required for social media contexts, and evaluation protocols rarely account for explanation or out-of-domain generalization. To bridge this gap, we introduce \textbf{So-Fake}, a comprehensive social media-oriented dataset for forgery detection consisting of two key components. First, we present \textbf{So-Fake-Set}, a large-scale dataset with over \textbf{2 million} photorealistic images from diverse generative sources, synthesized using a wide range of generative models. Second, to rigorously evaluate cross-domain robustness, we establish \textbf{So-Fake-OOD}, a novel and large-scale (\textbf{100K}) out-of-domain benchmark sourced from real social media platforms and featuring synthetic imagery from commercial models explicitly excluded from the training distribution, creating a realistic testbed that mirrors actual deployment scenarios. Leveraging these complementary datasets, we present \textbf{So-Fake-R1}, a baseline framework that applies reinforcement learning to encourage interpretable visual rationales. Experiments show that So-Fake surfaces substantial challenges for existing methods. By integrating a large-scale dataset, a realistic out-of-domain benchmark, and a multi-dimensional evaluation protocol, So-Fake establishes a new foundation for social media forgery detection research.
Primary Area: datasets and benchmarks
Submission Number: 9906
Loading