Multi-Perspective Frequency Domain Learning for Generalizable AI-Generated Image Detection

Published: 24 Oct 2025, Last Modified: 31 Jan 2026, ECAI 2025, Everyone, CC BY 4.0
Abstract: The prevalence of generative models for image and video synthesis has raised widespread concern about potential harm and misuse. To determine whether an image is authentic, most existing methods apply the Fast Fourier Transform (FFT) to extract frequency features. However, the frequency-domain representations extracted by FFT are not comprehensive enough for AI-generated image detection. In this paper, we propose a Multi-perspective Frequency Domain Learning (MFDL) framework, which aims to learn both generalized and discriminative frequency representations via the Discrete Wavelet Transform (DWT) and FFT. Specifically, we design a Frequency Representation Enhancement (FRE) module that uses the DWT and incorporates a multi-granularity enhancement strategy, which amplifies all high-frequency subbands to improve discriminability. Additionally, we introduce a Frequency Representation Consistency (FRC) module, which employs complex convolution to capture and preserve forgery patterns in the real and imaginary components derived from FFT. By integrating the complementary frequency representations from the DWT and FFT domains obtained through the FRE and FRC modules, MFDL achieves a comprehensive understanding of forgery traces in the frequency domain, which enhances the model's generalization capability for detecting generated content. Extensive experiments on 32 distinct datasets, covering both GAN-generated and diffusion-based images, validate the effectiveness of multi-perspective frequency domain learning and show that MFDL outperforms existing detection methods, confirming its strong generalization ability across diverse generative models.
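As a rough illustration of the two frequency views described in the abstract, the NumPy sketch below computes a single-level Haar DWT with amplified detail subbands and applies a complex convolution to the real and imaginary parts of an FFT spectrum. It is not the authors' implementation. The Haar wavelet, the fixed scalar `gain` (a stand-in for the multi-granularity enhancement strategy, whose details the abstract does not give), the 3x3 kernels, and all function names are assumptions made for illustration.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT (height and width must be even)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    d1 = (a + b - c - d) / 2.0  # detail: difference between rows
    d2 = (a - b + c - d) / 2.0  # detail: difference between columns
    d3 = (a - b - c + d) / 2.0  # detail: diagonal difference
    return ll, d1, d2, d3


def amplify_details(x, gain=2.0):
    """Scale every high-frequency subband by `gain` (hypothetical stand-in)."""
    ll, d1, d2, d3 = haar_dwt2(x)
    return ll, gain * d1, gain * d2, gain * d3


def conv2d_valid(x, k):
    """'Valid' 2D cross-correlation, as used by deep-learning conv layers."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.result_type(x, k))
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * x[i:i + oh, j:j + ow]
    return out


def complex_conv2d(xr, xi, wr, wi):
    """Complex convolution built from four real convolutions.

    (wr + i*wi) applied to (xr + i*xi) gives
    real = wr*xr - wi*xi and imag = wr*xi + wi*xr.
    """
    out_r = conv2d_valid(xr, wr) - conv2d_valid(xi, wi)
    out_i = conv2d_valid(xi, wr) + conv2d_valid(xr, wi)
    return out_r, out_i


# Toy usage on a random "image"; a real detector would learn wr and wi.
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))

ll, d1, d2, d3 = amplify_details(image, gain=2.0)  # DWT view

spectrum = np.fft.fft2(image)                      # FFT view
wr = rng.standard_normal((3, 3))                   # hypothetical real kernel
wi = rng.standard_normal((3, 3))                   # hypothetical imaginary kernel
feat_r, feat_i = complex_conv2d(spectrum.real, spectrum.imag, wr, wi)
```

Treating the real and imaginary parts as one complex signal, instead of as two independent channels, keeps the coupling between them that an ordinary real-valued convolution would ignore.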