Abstract: Image watermarks have been considered a promising technique to help detect AI-generated content, which can be used to protect copyright or prevent fake image abuse. In this work, we present a black-box method for removing invisible image watermarks, without needing any dataset of watermarked images or any knowledge of the watermark system. Our approach is simple to implement: given a single watermarked image, we regress it with a deep image prior (DIP). We show that from the intermediate steps of DIP one can reliably find an evasion image that removes the invisible watermark while preserving high image quality. Due to its unique working mechanism and practical effectiveness, we advocate including DIP as a baseline evasion method for benchmarking the robustness of watermarking systems. Finally, by showing the limited ability of DIP and other existing black-box methods in evading training-based visible watermarks, we discuss the positive implications for the practical use of training-based visible watermarks to prevent misinformation abuse.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: - Expanded Sec. 3.3 `Remarks on DIP-based watermark evasion` to clarify whether watermark evasion methods rely on decoders, and to reference Appendix C for the DIP evasion runtime analysis.
- Corrected the citation error in the `Experiment Setup` paragraph in Section 4: `Diffuser` and `VAE` are now attributed to the correct paper.
- Added Table 5 and corresponding discussion to provide quantitative support for the frequency analysis of DIP watermark evasion.
- Added Section 6 `Ethical Statement`
- Added discussion of `$WIND_{inpainting}$` watermark in Appendix F.
- Added code link in abstract.
- Updated required info for final version: publication date, OpenReview URL
Code: https://github.com/sun-umn/DIP_Watermark_Evasion_TMLR
Assigned Action Editor: ~Chinmay_Hegde1
Submission Number: 4248