What if Retrieval Could Work Before Decoding? The case of JPEG AI Latents for Deepfake Source Attribution

Published: 26 Oct 2025, Last Modified: 12 Nov 2025 | 1st Deepfake Forensics Workshop: Detection, Attribution, Recognition, and Adversarial Challenges in the Era of AI-Generated Media | WM2024 Conference
Abstract: We explore whether the latent space of the recent JPEG AI compression standard can be employed for high-level semantic tasks. Specifically, we propose a decoding-free approach to image-to-image retrieval and deepfake generator attribution that operates directly on JPEG AI latents, using simple global average pooling and cosine similarity, without any training or learned parameters. Our experiments show that these latent representations retain both semantic content and generator-specific signatures. Using only two latents per image, we achieve consistent mean Top-1 accuracy across eight retrieval classes and high attribution performance in a multi-generator deepfake setting. Compared to traditional RGB-based pipelines, our method eliminates synthesis transformation and color post-processing, yielding substantial efficiency gains. We argue that compressed-domain semantic indexing may play a central role in large-scale generative content monitoring, assuming that appropriate transparency and user consent mechanisms are implemented.
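The pipeline described in the abstract (global average pooling of each latent followed by cosine-similarity matching, with no learned parameters) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the latent shapes, the fusion of the two pooled latents by concatenation, the synthetic stand-in data, and the helper names `pool_latent`, `embed`, and `top1`. It does not read real JPEG AI bitstreams and is not the authors' implementation.

```python
import numpy as np

def pool_latent(latent):
    """Global average pooling over spatial dims: (C, H, W) -> (C,)."""
    return latent.mean(axis=(1, 2))

def embed(latents):
    """Pool each latent, concatenate (assumed fusion), and L2-normalise."""
    vec = np.concatenate([pool_latent(z) for z in latents])
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def top1(query, gallery, labels):
    """Label of the gallery descriptor with the highest cosine similarity.

    Descriptors are unit-norm, so a dot product equals cosine similarity.
    """
    sims = gallery @ query
    return labels[int(np.argmax(sims))]

# --- Synthetic stand-in data (hypothetical shapes, not real JPEG AI latents) ---
rng = np.random.default_rng(0)
SHAPES = [(8, 16, 16), (4, 8, 8)]  # two latents per image, shapes assumed

def synthetic_latents(channel_means, noise=0.1):
    """Latents whose per-channel means mimic a source-specific signature."""
    return [
        mu[:, None, None] + noise * rng.standard_normal((c, h, w))
        for (c, h, w), mu in zip(SHAPES, channel_means)
    ]

# One reference signature per hypothetical source.
class_means = {
    name: [rng.standard_normal(c) for c, _, _ in SHAPES]
    for name in ("generator_a", "generator_b")
}
labels = list(class_means)

# Gallery of reference descriptors, then a fresh query from generator_b.
gallery = np.stack([embed(synthetic_latents(class_means[n])) for n in labels])
query = embed(synthetic_latents(class_means["generator_b"]))
predicted = top1(query, gallery, labels)
print(predicted)
```

Because the descriptors are normalised once at indexing time, matching reduces to a single matrix-vector product over the gallery, which is what makes a training-free, decoding-free index cheap to query in this sketch.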