Keywords: GAN, evaluation, embedding
Abstract: Embeddings from CNNs pretrained on ImageNet classification are the de facto standard image representations for assessing GANs via the FID, Precision, and Recall measures. Despite broad prior criticism of their use for non-ImageNet domains, these embeddings remain the top choice in most of the GAN literature. In this paper, we advocate using state-of-the-art self-supervised representations to evaluate GANs on established non-ImageNet benchmarks. These representations, typically obtained via contrastive learning, are shown to transfer better to new tasks and domains and can therefore serve as more universal embeddings of natural images. Through an extensive comparison of recent GANs on common datasets, we show that self-supervised representations produce a more reasonable ranking of models in terms of FID/Precision/Recall, while the ranking obtained with classification-pretrained embeddings can often be misleading.
One-sentence Summary: We show that state-of-the-art self-supervised representations should be used when comparing GANs on non-ImageNet datasets.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Data: [CelebA-HQ](https://paperswithcode.com/dataset/celeba-hq), [FFHQ](https://paperswithcode.com/dataset/ffhq), [ImageNet](https://paperswithcode.com/dataset/imagenet), [LSUN](https://paperswithcode.com/dataset/lsun)
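
For readers unfamiliar with how FID depends on the choice of embedding, the following is a minimal sketch, not the authors' implementation. It assumes the real and generated images have already been mapped to feature vectors by some encoder (ImageNet-pretrained or self-supervised) and computes the Fréchet distance between Gaussians fitted to the two feature sets. The function name `frechet_distance` and the random placeholder features are illustrative assumptions; only the embedding step, which is omitted here, would change between the two evaluation protocols.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two embedding sets.

    Both inputs are arrays of shape (n_samples, dim), produced by any
    image encoder (the encoder itself is outside this sketch).
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_f = np.atleast_2d(np.cov(feats_fake, rowvar=False))

    # Tr((cov_r cov_f)^(1/2)) equals the sum of square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative for
    # positive semi-definite covariances (up to numerical noise).
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)


# Placeholder features standing in for encoder outputs (assumption).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shift = np.array([2.0, 0.0, 0.0, 0.0])
fake = real + shift  # same covariance, mean moved by `shift`

fid_same = frechet_distance(real, real)
fid_shifted = frechet_distance(real, fake)
```

In this toy setup the two feature sets share a covariance, so the distance reduces to the squared norm of the mean shift. With real encoders, swapping the embedding changes both the means and the covariances, which is why model rankings can differ between representations.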