The Robustness-Security Paradox: Channel-Aware Feature Learning for Adversarial Watermark Exploitation

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Image Watermarking, Watermarking Security, AI-generated content detection
Abstract: Watermarking is crucial for establishing provenance and detecting AI-generated content. While current approaches prioritize robustness against real-world distortions, we explore how the robustness-security tradeoff manifests in deep learning-based watermarks: robust watermarks necessarily increase the redundancy of detectable watermark patterns embedded in images, creating exploitable information leakage. Leveraging this insight, we introduce an attack framework that extracts watermark pattern leakage through multi-channel feature learning using pre-trained vision models. Unlike previous approaches that require extensive data or detector access, our method achieves both watermark removal (detection evasion) and watermark forgery attacks with just a single watermarked image in a no-box setting. Extensive experiments demonstrate that our method outperforms state-of-the-art techniques by 74% in detection evasion rate and 47% in forgery accuracy while preserving visual quality.
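To make the abstract's attack idea concrete, the following is a minimal, hypothetical sketch of a no-box removal attack that optimizes a small perturbation against channel-wise statistics of a frozen pre-trained vision backbone. It is not the authors' released method: the choice of VGG-16, the layer indices, the loss form, and all hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch: suppress channel-wise "watermark leakage" statistics of a
# single watermarked image using a frozen pre-trained backbone (no detector access).
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"
backbone = vgg16(weights=VGG16_Weights.DEFAULT).features.to(device).eval()
for p in backbone.parameters():
    p.requires_grad_(False)

def channel_energy(x, layers=(3, 8, 15, 22)):
    """Per-channel mean activations at several depths (multi-channel features).
    ImageNet normalization is omitted here for brevity."""
    feats, h = [], x
    for i, layer in enumerate(backbone):
        h = layer(h)
        if i in layers:
            feats.append(h.mean(dim=(2, 3)))  # one scalar per channel
    return feats

def remove_watermark(x_w, steps=200, lr=1e-2, eps=4 / 255, lam=10.0):
    """Estimate a small perturbation that pushes the image's channel statistics
    away from the watermarked reference while bounding visual distortion."""
    x_w = x_w.to(device)
    ref = [f.detach() for f in channel_energy(x_w)]
    delta = torch.zeros_like(x_w, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = (x_w + delta).clamp(0, 1)
        feats = channel_energy(x_adv)
        # Minimizing the negative MSE maximizes the distance from the
        # watermarked reference statistics; lam penalizes large perturbations.
        leak = sum(-F.mse_loss(f, r) for f, r in zip(feats, ref))
        loss = leak + lam * delta.pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)
    return (x_w + delta).detach().clamp(0, 1), delta.detach()

# Forgery variant (also an assumption): transplant the estimated watermark-like
# residual onto a clean image, e.g.
#   x_forged = (x_clean + (x_w - x_removed)).clamp(0, 1)
```

Under these assumptions, removal and forgery reuse the same single-image residual estimate: subtracting it evades detection, adding it to a clean image imitates the watermark.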
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 12929