Can Simple Averaging Defeat Modern Watermarks?

Pei Yang; Hai Ci; Yiren Song; Mike Zheng Shou

Published: 25 Sept 2024, Last Modified: 13 Jan 2025. Venue: NeurIPS 2024 poster. License: CC BY 4.0

Keywords: Watermark, Security

TL;DR: Many content-agnostic digital watermarking techniques are susceptible to simple steganalysis-based watermark removal.

Abstract: Digital watermarking techniques are crucial for copyright protection and source identification of images, especially in the era of generative AI models. However, many existing watermarking methods, particularly content-agnostic approaches that embed fixed patterns regardless of image content, are vulnerable to steganalysis attacks that can extract and remove the watermark with minimal perceptual distortion. In this work, we categorise watermarking algorithms into content-adaptive and content-agnostic ones, and demonstrate how averaging a collection of watermarked images could reveal the underlying watermark pattern. We then leverage this extracted pattern for effective watermark removal under both greybox and blackbox settings, even when the collection of images contains multiple watermark patterns. For some algorithms like Tree-Ring watermarks, the extracted pattern can also forge convincing watermarks on clean images. Our quantitative and qualitative evaluations across twelve watermarking methods highlight the threat posed by steganalysis to content-agnostic watermarks and the importance of designing watermarking techniques resilient to such analytical attacks. We propose security guidelines calling for using content-adaptive watermarking strategies and performing security evaluation against steganalysis. We also suggest multi-key assignments as potential mitigations against steganalysis vulnerabilities. GitHub page: https://github.com/showlab/watermark-steganalysis
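
The averaging idea described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' implementation. It assumes a purely additive, content-agnostic watermark (the same fixed pattern added to every image), synthetic random "images", and a reference set of clean images that are not paired with the watermarked ones. The function names `estimate_pattern`, `remove_pattern`, and `simulate`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_pattern(watermarked, reference):
    """Estimate a fixed additive pattern as the difference of the two set means.

    Image content averages out across each set, so (under the additive
    assumption) the difference is the pattern plus residual sampling noise.
    """
    return watermarked.mean(axis=0) - reference.mean(axis=0)


def remove_pattern(image, pattern):
    """Subtract the estimated pattern and keep pixel values in [0, 1]."""
    return np.clip(image - pattern, 0.0, 1.0)


def simulate(n=4000, size=16, strength=0.02, seed=0):
    rng = np.random.default_rng(seed)

    # Assumed watermark: one fixed +/- strength pattern shared by all images.
    pattern = strength * rng.choice([-1.0, 1.0], size=(size, size))

    # Synthetic stand-ins for images; the range avoids clipping after embedding.
    clean = rng.uniform(0.1, 0.9, size=(n, size, size))
    watermarked = clean + pattern

    # Unpaired clean images drawn from the same distribution (an assumption).
    reference = rng.uniform(0.1, 0.9, size=(n, size, size))

    estimate = estimate_pattern(watermarked, reference)
    corr = np.corrcoef(estimate.ravel(), pattern.ravel())[0, 1]

    # Compare the distance to the clean image before and after removal.
    restored = remove_pattern(watermarked[0], estimate)
    err_before = np.abs(watermarked[0] - clean[0]).mean()
    err_after = np.abs(restored - clean[0]).mean()
    return corr, err_before, err_after


if __name__ == "__main__":
    corr, err_before, err_after = simulate()
    print(f"correlation between estimate and true pattern: {corr:.3f}")
    print(f"mean abs error before removal: {err_before:.4f}")
    print(f"mean abs error after removal:  {err_after:.4f}")
```

In this toy setting the estimate improves as the number of averaged images grows, because the residual image content shrinks roughly with the square root of the set size. A content-adaptive watermark, whose perturbation changes with each image, would not collapse to a single recoverable pattern under this kind of averaging, which is consistent with the guideline stated in the abstract.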

Primary Area: Safety in machine learning

Submission Number: 9663
