Latents-Inv: Robust Semantic Watermark with Key-Assisted Recovery for Diffusion Models

20 Sept 2025 (modified: 03 Dec 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: watermark, AI Security, diffusion model
Abstract: Semantic watermarking provides imperceptible identity traceability for diffusion-generated images, enabling model copyright protection and image source verification. However, existing semantic watermarking methods based on the initial latent noise leave protected images vulnerable to adversarial latent-space manipulations, such as black-box forgery via proxy models and watermark-pattern-removal attacks that exploit statistical regularities. In this paper, we propose a robust watermarking framework that is resilient to diverse adversarial manipulation attacks. Specifically, we design a fully reversible, flow-based codec with dual encoding paths, allowing plug-and-play integration into the diffusion generation process across architectures (UNet and MMDiT). The dual-output network encodes watermark information into both the carrier image and the owner's secret key, enabling watermarks damaged by removal attacks to be recovered via key-assisted reconstruction. To guarantee verification reliability without excessive reliance on the key, while retaining the ability to detect forged watermarked images, we propose a joint-training strategy that leverages negative-sample pairs under both accuracy and fidelity constraints. Furthermore, we introduce an Euler-based enhanced solver for effective inversion in rectified flow models, which improves the accuracy of the recovered watermark information. Experimental results show that our method achieves superior robustness under various attacks while maintaining high visual quality across diverse models.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 25446
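
The abstract mentions inverting a rectified flow model with an Euler-based solver in order to recover the watermarked latent. As a rough illustration of the underlying idea only, the sketch below integrates a velocity field forward (noise to image) and then backward (image to noise) with plain fixed-step explicit Euler. It is not the enhanced solver proposed in the submission, whose details are not given here. The function `toy_velocity` is a hypothetical stand-in for a learned velocity network, and the time convention (t = 0 for noise, t = 1 for the image) is an assumption made for this example.

```python
import math


def toy_velocity(x, t):
    """Hypothetical stand-in for a learned rectified-flow velocity network v(x, t)."""
    return [math.sin(xi) + t for xi in x]


def euler_integrate(x, t_start, t_end, num_steps, velocity=toy_velocity):
    """Integrate dx/dt = v(x, t) from t_start to t_end with fixed-step explicit Euler."""
    dt = (t_end - t_start) / num_steps  # negative when integrating backward in time
    t = t_start
    for _ in range(num_steps):
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        t += dt
    return x


def generate(noise, num_steps):
    """Forward pass: assumed convention is t = 0 (noise) to t = 1 (image latent)."""
    return euler_integrate(noise, 0.0, 1.0, num_steps)


def invert(latent, num_steps):
    """Inversion: run the same ODE backward from t = 1 to t = 0 to estimate the noise."""
    return euler_integrate(latent, 1.0, 0.0, num_steps)


def round_trip_error(noise, num_steps):
    """Largest per-coordinate gap between the original and the recovered noise."""
    recovered = invert(generate(noise, num_steps), num_steps)
    return max(abs(a - b) for a, b in zip(noise, recovered))


if __name__ == "__main__":
    noise = [0.3, -1.2, 2.0]
    for steps in (10, 100, 1000):
        print(steps, round_trip_error(noise, steps))
```

Plain Euler inversion is only approximate: the backward step evaluates the velocity at a different point than the forward step did, so the recovered noise drifts from the original, and the drift shrinks as the step count grows. That residual error is the kind of inaccuracy an improved inversion solver would aim to reduce, since a watermark embedded in the initial latent can only be read back as accurately as the latent is recovered.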