Keywords: watermarking, attacks, vision
Abstract: The growing use of generative models has intensified the need for watermarking methods that ensure content attribution and provenance. While recent semantic watermarking schemes improve robustness by embedding signals in latent or frequency representations, we show that they remain vulnerable even in resource-constrained adversarial settings. We present \textsc{DAWN}, a training-free, single-image attack that removes or weakens watermarks without access to the underlying model. By projecting watermarked images onto natural priors across complementary representations, \textsc{DAWN} suppresses watermark signals while preserving visual fidelity. Experiments across diverse watermarking schemes demonstrate that our approach consistently reduces watermark detectability, revealing fundamental weaknesses in current designs. Our code is available at \url{https://anonymous.4open.science/r/DAWN-567A/}.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 4215