- Keywords: global image texture, generative adversarial networks, Gram matrix
- TL;DR: An empirical study of fake images reveals that texture is an important cue distinguishing current fake images from real ones. Our improved model, which captures global texture statistics, shows better cross-GAN fake image detection performance.
- Abstract: GANs can now generate increasingly realistic face images that easily fool human beings. In contrast, a common convolutional neural network (CNN), e.g., ResNet-18, can achieve more than 99.9% accuracy in distinguishing fake from real faces if the training and testing faces come from the same source. In this paper, we perform both human studies and CNN experiments, which lead us to two important findings. The first is that the textures of fake faces are substantially different from those of real ones: CNNs can capture local image texture information for recognizing fake and real faces, while such cues are easily overlooked by humans. The second is that global image texture information is more robust to image editing and more generalizable to fake faces from different GANs and datasets. Based on these findings, we propose a novel architecture, Gram-Net, which incorporates “Gram Blocks” at multiple semantic levels to extract global image texture representations. Experimental results demonstrate that Gram-Net performs better than existing approaches for fake face detection. In particular, Gram-Net is more robust to image editing, e.g., downsampling, JPEG compression, blur, and noise. More importantly, Gram-Net generalizes significantly better in detecting fake faces from GAN models not seen in the training phase.
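
The abstract does not spell out what a Gram Block computes. As a minimal sketch, assuming it builds on the standard Gram matrix of a feature map (the channel-by-channel inner products pooled over all spatial positions), the global texture statistic could look like the following. The function name `gram_matrix`, the normalisation by H*W, and the random feature map are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Return the (C, C) Gram matrix of a (C, H, W) feature map.

    Hypothetical helper for illustration: each entry is the inner product
    of two channels over all spatial positions, divided by H*W.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per channel
    return flat @ flat.T / (h * w)


# Stand-in for a CNN feature map (assumed shape: 4 channels, 8x8 spatial grid).
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((4, 8, 8))
gram = gram_matrix(feature_map)

# Shuffle the spatial positions identically across channels.
perm = rng.permutation(8 * 8)
shuffled = feature_map.reshape(4, -1)[:, perm].reshape(4, 8, 8)
gram_shuffled = gram_matrix(shuffled)
```

Because the sum runs over every spatial position, `gram_shuffled` matches `gram` up to floating-point error: the statistic discards where a pattern occurs and keeps only how channels co-activate, which is one sense in which it describes global rather than local texture.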