Keywords: Image Forgery Localization, Diffusion Models, Dataset Creation, Generative Models
TL;DR: We propose a new method for diffusion-era image forgery localization that adapts SAM2 with perturbation-driven heatmap prompts and automatically constructs diffusion-edit datasets.
Abstract: Image forgery localization in the generative AI era poses new challenges, as modern editing pipelines produce photorealistic, semantically coherent manipulations that evade conventional detectors while model capabilities evolve rapidly.
In response, we develop Detective SAM, a framework built on SAM2, a foundation model for image segmentation, that integrates perturbation-driven forensic clues with lightweight feature adapters and a mask adapter to convert forensic clues into forgery masks via automatic prompting.
Moreover, to keep up with the rapidly evolving capabilities of diffusion models, we introduce AutoEditForge: an automated diffusion edit generation pipeline spanning four edit types. This supplies high-quality data to maintain localization accuracy under newly released editors and enables continual fine-tuning for Detective SAM.
Across seven benchmark datasets and seven baselines, Detective SAM delivers stable out-of-distribution performance, averaging 36.99 IoU / 44.19 F1, a 33.67% relative IoU gain over the best baseline. Further, we show that state-of-the-art edits cause localization systems to collapse.
With 500 AutoEditForge samples, Detective SAM quickly adapts and restores performance, enabling practical, low-friction updates as editing models improve.
AutoEditForge, Detective SAM's pretrained weights and training script are available at the anonymized repository: https://anonymous.4open.science/r/Detective-SAM-9057/.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 7683
Loading