Keywords: Tampering Detection, Vision Encoder
Abstract: Encoder-as-a-Service (EaaS) enables pre-trained encoders to be shared across tasks, reducing cost but introducing integrity risks when models are modified without notice. Detecting such tampering is difficult under a strict black-box setting, where the encoder is hidden inside an unknown pipeline and only the application's outputs are observable. Existing fingerprinting methods fail in this setting because they require access to model predictions or task-specific information.
We present a novel fingerprinting framework for black-box encoder verification, \emph{grounded in a theoretical insight that larger embedding divergence increases the likelihood of downstream output differences}. Building on this principle, we construct \emph{fingerprint twins}—paired inputs that produce nearly identical embeddings on an intact encoder but diverge sharply after tampering. We simulate realistic changes using \emph{importance-aware perturbations} and optimize twins to maximize KL divergence while constraining perturbations within an $\epsilon$-ball for natural appearance. Experiments across datasets and encoder types demonstrate reliable, task-agnostic detection with negligible impact on utility.
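The twin-construction idea described above can be illustrated with a toy sketch. The snippet below is not the authors' implementation: the linear "encoder," the importance-aware tampering stand-in (nudging the largest-magnitude weights), and the finite-difference PGD-style ascent are all simplifying assumptions made for illustration. It crafts a perturbed input inside an $\epsilon$-ball around a base input so that the intact and tampered encoders' (softmaxed) embeddings diverge maximally in KL.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    # KL divergence with small additive smoothing for numerical safety
    return float(np.sum(p * np.log((p + 1e-12) / (q + 1e-12))))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))              # toy "intact" encoder weights

# Importance-aware tampering stand-in (assumption): perturb only the
# largest-magnitude weights, mimicking changes to important parameters.
W_t = W.copy()
idx = np.unravel_index(np.argsort(-np.abs(W), axis=None)[:16], W.shape)
W_t[idx] += 0.5 * np.sign(W_t[idx])

encode   = lambda x: softmax(W @ x)       # intact encoder
encode_t = lambda x: softmax(W_t @ x)     # simulated tampered encoder

def craft_twin(x, eps=0.05, steps=100, lr=0.005):
    """Ascend KL(encode(x') || encode_t(x')) over x', projecting x'
    back into the eps-ball around x after every step (PGD-style)."""
    x_twin = x.copy()
    h = 1e-4
    obj = lambda v: kl(encode(v), encode_t(v))
    for _ in range(steps):
        base = obj(x_twin)
        grad = np.zeros_like(x_twin)
        for i in range(x_twin.size):      # finite-difference gradient
            d = np.zeros_like(x_twin)
            d[i] = h
            grad[i] = (obj(x_twin + d) - base) / h
        x_twin = x_twin + lr * np.sign(grad)          # ascent step
        x_twin = np.clip(x_twin, x - eps, x + eps)    # eps-ball projection
    return x_twin

x = rng.normal(size=16)
x_twin = craft_twin(x)
```

After optimization, the twin stays within the $\epsilon$-ball of the original input (so its embedding on the intact encoder remains close), while the intact-versus-tampered KL divergence at the twin is amplified relative to the base input, which is the signal used for detection.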
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 24311