Keywords: AI-generated image detection, Training-free, self-supervised model, RandomResizedCrop
TL;DR: Self-supervised models learn robust representations of real images under cropping and resizing, which can be applied to detect AI-generated images without training.
Abstract: AI-generated image detection has become crucial with the rapid advancement of vision-generative models. Instead of training detectors tailored to specific datasets, we study a training-free approach leveraging self-supervised models without requiring prior data knowledge. These models, pre-trained with augmentations like $\texttt{RandomResizedCrop}$, learn to produce consistent representations across varying resolutions. Motivated by this, we propose $\textbf{WaRPAD}$, a training-free AI-generated image detection algorithm based on self-supervised models. Since neighborhood pixel differences in images are highly sensitive to resizing operations, WaRPAD first defines a base score function that quantifies the sensitivity of image embeddings to perturbations along high-frequency directions extracted via Haar wavelet decomposition. To simulate robustness against cropping augmentation, we rescale each image to a multiple of the model’s input size, divide it into smaller patches, and compute the base score for each patch. The final detection score is then obtained by averaging the scores across all patches. We validate WaRPAD on real datasets of diverse resolutions and domains, and images generated by 23 different generative models. Our method consistently achieves competitive performance and demonstrates strong robustness to test-time corruptions. Furthermore, as invariance to $\texttt{RandomResizedCrop}$ is a common training scheme across self-supervised models, we show that WaRPAD is applicable across self-supervised models.
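The pipeline described in the abstract (a Haar-based high-frequency perturbation, a per-patch sensitivity score, and averaging over patches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the embedder `make_toy_embedder` is a hypothetical random projection standing in for a self-supervised model, the cosine-distance form of `base_score`, the step size `eps`, the `grid` factor, and the nearest-neighbour resize are all assumptions made for illustration, and the sketch handles only single-channel images with an even `input_size`.

```python
import numpy as np

def haar_high_freq(x):
    """High-frequency part of a single-level 2D Haar decomposition.

    For the Haar wavelet, the low-frequency (LL) reconstruction is the mean of
    each 2x2 block, so the sum of the detail bands is x minus that block mean.
    Expects a 2D array with even height and width.
    """
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    low = blocks.mean(axis=(1, 3), keepdims=True)
    return x - np.broadcast_to(low, blocks.shape).reshape(h, w)

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize (assumption: the paper's interpolation may differ)."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols[None, :]]

def make_toy_embedder(input_size, dim=16, seed=0):
    """Hypothetical stand-in for a self-supervised encoder: fixed random projection."""
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((input_size * input_size, dim))
    def embed(patch):
        return np.tanh(patch.reshape(-1) @ proj / input_size)
    return embed

def base_score(patch, embed, eps=0.1):
    """Assumed base score: cosine distance between the embeddings of a patch
    and of the same patch perturbed along its high-frequency direction."""
    z0 = embed(patch)
    z1 = embed(patch + eps * haar_high_freq(patch))
    cos = z0 @ z1 / (np.linalg.norm(z0) * np.linalg.norm(z1) + 1e-12)
    return 1.0 - cos

def patch_averaged_score(img, embed, input_size, grid=2):
    """Rescale to grid * input_size per side, score each patch, average."""
    side = grid * input_size
    big = resize_nearest(img, side, side)
    scores = []
    for i in range(grid):
        for j in range(grid):
            patch = big[i * input_size:(i + 1) * input_size,
                        j * input_size:(j + 1) * input_size]
            scores.append(base_score(patch, embed))
    return float(np.mean(scores))

# Usage on a synthetic grayscale image (random data, for illustration only).
input_size = 8
embed = make_toy_embedder(input_size)
image = np.random.default_rng(1).random((20, 30))
score = patch_averaged_score(image, embed, input_size, grid=2)
```

In practice the toy embedder would be replaced by a pre-trained self-supervised encoder, and the resulting score would be thresholded or ranked to separate real from generated images; the abstract does not specify those details, so they are left out here.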
Supplementary Material: zip
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 11375