In-Context Alignment: Resolving Representation Conflict for Parameter-Efficient Forgery Detection in Vision Models
Keywords: representation conflict, prompt learning, visual forensics, parameter-efficient learning
Abstract: Vision Foundation Models (VFMs) have shown remarkable potential in image forensics, yet their content-driven representations often suppress the subtle forensic cues essential for manipulation localization, revealing an inherent representation conflict. Conventional full fine-tuning struggles to resolve this conflict: it demands extensive parameter updates, risks overfitting, and erodes prior knowledge, leading to poor generalization across diverse forgery detection scenarios. We propose In-Context Alignment (ICA), a parameter-efficient framework that reframes forgery localization as a visual in-context learning task. ICA introduces two complementary prompting mechanisms within frozen VFMs: a Physical-Aware Prompter (PAP) that enhances suppressed low-level forensic signals such as noise and frequency artifacts, fusing them adaptively through a Mixture-of-Experts, and a Semantic-Aware Prompter (SAP) that steers the model to expose semantic inconsistencies in high-level features.
With only a small fraction of parameters updated, ICA achieves strong performance across diverse image forgery localization benchmarks and can even compete with fully fine-tuned models. Our results demonstrate that in-context alignment of semantic and forensic representations offers a scalable, robust, and efficient paradigm for advancing visual forensics.
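The paper does not provide implementation details, but the PAP's adaptive fusion can be illustrated with a minimal numpy sketch. Everything below is an assumption for illustration: the function name `pap_fuse`, the shapes, and the choice of a softmax gate over per-patch features are all hypothetical, standing in for the (unspecified) Mixture-of-Experts design. The key structural point it shows is that the backbone features stay frozen while only the small gating matrix would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pap_fuse(patch_feats, expert_outputs, gate_w):
    """Hypothetical Physical-Aware Prompter fusion step.

    patch_feats:    (N, D) frozen backbone patch features (not updated)
    expert_outputs: (E, N, D) low-level forensic branches, e.g. a
                    noise-residual expert and a frequency-artifact expert
    gate_w:         (D, E) learnable gating weights -- in this sketch,
                    the only parameters that would receive gradients
    """
    gates = softmax(patch_feats @ gate_w, axis=-1)        # (N, E) per-patch expert weights
    fused = np.einsum("ne,end->nd", gates, expert_outputs)  # weighted expert mixture
    return patch_feats + fused  # fused forensic prompt added to frozen features

# Toy dimensions: 4 patches, 8-dim features, 3 experts.
N, D, E = 4, 8, 3
feats = rng.normal(size=(N, D))
experts = rng.normal(size=(E, N, D))
gate = rng.normal(size=(D, E))
out = pap_fuse(feats, experts, gate)
print(out.shape)  # (4, 8)
```

Because the gate is conditioned on each patch's own features, different image regions can lean on different forensic experts, which is the "adaptive fusion" behavior the abstract attributes to the Mixture-of-Experts.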
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 11007