INFER: Implicit Neural Features for Exposing Realism

ICLR 2026 Conference Submission 14679 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Implicit Neural Representations, Deep Fake Detection, Classification
Abstract: Deepfakes pose a significant threat to the authenticity of digital media, and current detection methods often fail to generalize to unseen manipulations. INFER is the first deepfake detection framework to leverage Implicit Neural Representations (INRs), marking a new direction in representation learning for forensic analysis. We combine high-level semantic priors from Contrastive Language–Image Pre-training (CLIP) with spatially detailed, frequency-sensitive features from INR-derived heatmaps. While CLIP captures global context grounded in natural image statistics, INR heatmaps expose subtle structural inconsistencies that conventional detectors often overlook. Crucially, their fusion transforms the feature space in a way that enhances class separability, re-encoding both spatial artifacts and semantic inconsistencies into a more discriminative representation. This complementary integration yields more robust detection, especially under challenging distribution shifts and unseen forgery types. Extensive experiments on standard deepfake benchmarks demonstrate that our method outperforms existing approaches by a clear margin, highlighting its strong generalization, robustness, and practical utility.
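The fusion idea in the abstract, concatenating a global CLIP-style embedding with features pooled from an INR-derived heatmap, can be sketched as follows. This is a minimal illustrative assumption, not the authors' actual architecture: the encoder stand-ins, feature dimensions, and pooling grid are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_embedding(image):
    # Stand-in for a frozen CLIP image encoder producing a 512-d
    # global semantic feature (dimension is an assumption).
    return rng.standard_normal(512)

def inr_heatmap_features(image, grid=8):
    # Stand-in for an INR-derived heatmap: the paper's heatmaps would
    # come from fitting an INR to the image; here we use a placeholder
    # residual map, pooled into a grid x grid block of means.
    h, w = image.shape
    heatmap = np.abs(image - image.mean())  # placeholder "fit residual"
    blocks = heatmap[: grid * (h // grid), : grid * (w // grid)]
    blocks = blocks.reshape(grid, h // grid, grid, w // grid)
    return blocks.mean(axis=(1, 3)).ravel()  # grid*grid pooled features

def fused_features(image):
    # Concatenate semantic (CLIP) and structural (INR heatmap) cues
    # into one representation for a downstream classifier.
    return np.concatenate([clip_embedding(image),
                           inr_heatmap_features(image)])

image = rng.standard_normal((64, 64))
feats = fused_features(image)
print(feats.shape)  # (576,) = 512 CLIP dims + 64 heatmap dims
```

A real implementation would replace both stand-ins with the trained encoders and feed the fused vector to the detection head; the point of the sketch is only the complementary concatenation of global and local features.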
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 14679