High-fidelity Dual-layer Backdoor Watermarking Scheme based on Private Model Embedding

Lingyun Xiang, Fangbo Luo, Xiangli Jin

Published: 2025, Last Modified: 01 Mar 2026, Venue: TrustCom 2025, License: CC BY-SA 4.0

Abstract: Traditional backdoor watermarking techniques typically inject trigger samples into the training process, which can cause model overfitting, distort feature distributions, and compromise decision boundaries, ultimately degrading performance on the primary task. To address these limitations, we propose DualPrivMark, a dual-layer backdoor watermarking scheme that leverages private model embedding. Instead of embedding the watermark directly into the protected model, DualPrivMark constructs a dedicated private model that cooperates with the target model, thereby isolating the watermarking task from the original task and preserving task fidelity. Within the private model, we design a two-layer watermarking architecture that combines a label layer for trigger-based verification with a signature layer for image-based authentication, enabling both high-fidelity embedding and enhanced robustness. Furthermore, a multi-task learning strategy jointly optimizes the original and watermarking tasks, ensuring high accuracy on both. Experimental results demonstrate that, compared with existing methods, DualPrivMark achieves reliable watermark embedding while substantially improving model fidelity and providing strong resistance against ambiguity attacks.
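The joint optimization mentioned in the abstract can be illustrated with a generic weighted multi-task objective. The sketch below is not the authors' formulation: the function names, the use of cross-entropy for both tasks, and the `wm_weight` trade-off parameter are all assumptions made for illustration only.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under a probability vector."""
    # Clamp to avoid log(0) when the predicted probability is exactly zero.
    return -math.log(max(probs[label], 1e-12))


def joint_loss(task_probs, task_label, wm_probs, wm_label, wm_weight=0.5):
    """Hypothetical joint objective: original-task loss plus a weighted
    watermark-task loss. wm_weight is an assumed hyperparameter, not a
    value taken from the paper."""
    task_loss = cross_entropy(task_probs, task_label)
    wm_loss = cross_entropy(wm_probs, wm_label)
    return task_loss + wm_weight * wm_loss
```

Under these assumptions, setting `wm_weight` to 0 recovers plain training on the original task, while larger values push the optimizer to fit the watermark task more aggressively.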

External IDs: dblp:conf/trustcom/XiangLJ25