Track: Regular paper
Keywords: AI-Edited Image Detection, Vision Language Models, Deepfake Detection
TL;DR: This paper introduces SHIELD, a benchmark study for zero-shot detection of AI-edited images using vision language models.
Abstract: The rapid progress of generative AI has enabled powerful image editing tools that can convincingly manipulate localized regions of real images. Such AI-edited images are increasingly exploited to spread misinformation, yet existing detectors, primarily designed for whole-image synthesis or DeepFakes, struggle to generalize and often fail against partial manipulations.
In this paper, we study AI-edited image detection from a zero-shot perspective, drawing inspiration from how humans approach the task. Humans are generally reliable because they are exposed primarily to authentic images and treat unusual or inconsistent regions as anomalies. Vision language models (VLMs), trained on large and diverse image–text corpora, offer a scalable analogue to this human ability. Leveraging VLMs for zero-shot inference provides a principled framework for anomaly detection while mitigating the overfitting issues that plague training-based detectors.
We present SHIELD, the first benchmark study of zero-shot AI-edited image detection using VLMs. Our evaluation covers 24 models under two prompting strategies (direct prompting and Chain-of-Thought prompting) and two inference modes (greedy decoding and sampling).
The results show that detection accuracy generally correlates with overall model capability. Notably, direct prompting with greedy decoding achieves the strongest performance, suggesting a "first impression" effect.
We further examine detection performance across different datasets, generative models, and editing methods, and discuss potential directions for improving detection accuracy.
The source code used in this study is available at https://github.com/Megum1/SHIELD.
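To illustrate the direct-prompting, greedy-decoding setting described above, the following minimal sketch queries a hosted vision language model with a yes/no question at temperature 0 (approximating greedy decoding). It assumes the OpenAI Python SDK, a vision-capable model name ("gpt-4o"), an example image URL, and a hypothetical prompt wording; the actual prompts, models, and inference settings used in SHIELD may differ.

    from openai import OpenAI

    client = OpenAI()

    # Direct prompting: a single yes/no question about the image,
    # with temperature 0 to approximate greedy decoding.
    response = client.chat.completions.create(
        model="gpt-4o",          # assumed vision-capable model
        temperature=0,           # greedy-like, deterministic decoding
        max_tokens=10,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Does this image contain AI-edited regions? Answer 'yes' or 'no'."},
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/suspect.jpg"}},  # placeholder image
                ],
            }
        ],
    )

    print(response.choices[0].message.content)  # e.g., "yes" or "no"

A Chain-of-Thought variant would instead ask the model to describe suspicious regions before giving its verdict, and the sampling mode would raise the temperature and aggregate multiple responses.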
Submission Number: 49