On the Generalisation Capability of Local Surface Frames in Detecting Diffusion-Based Facial Images

Andrea Ciamarra, Roberto Caldelli, Alberto Del Bimbo

Published: 2025, Last Modified: 04 Mar 2026, WACV (Workshops) 2025, CC BY-SA 4.0

Abstract: Extraordinary unreal images can be created with powerful AI techniques. Various tools available to everyone can produce high-quality content, including entirely synthetic images. Among the existing architectures, diffusion-based models can easily produce any kind of image, including human faces, from a prompt such as a text description. Such false content is often used to spread disinformation, which raises concerns about people's security. At present, it is becoming hard to develop reliable instruments to distinguish between real and generated (even non-existent) people. Moreover, the large number of diffusion-based implementations makes it difficult for such detectors to generalise to novel generative techniques. To address these issues, we investigate the capacity of a distinctive feature, based on the image acquisition environment, to distinguish diffusion-based face images from pristine ones. Generated images should not contain the characteristics that are proper to the acquisition phase performed through a real camera. Such inconsistencies can be highlighted by means of the recently introduced local surface frames. This feature takes into account the objects and surfaces in the scene, which all affect the camera acquisition process, along with further intrinsic information tied to the device, as well as the lighting and reflections affecting the entire scene. The paper explores the ability of this feature to generalise to different datasets and to new generative methods unseen during training. Experimental results show that the feature still provides significant levels of detection accuracy in these cases.

External IDs: dblp:conf/wacv/CiamarraCB25