Revisiting Coding-Based Approaches to Overcome the Curse of Dimensionality in Learning-Based Watermarking
Abstract: Deep learning–based watermarking has substantially improved robustness to real-world noise, but its performance degrades as the payload dimension increases. In contrast, coding-based methods such as quantization index modulation (QIM) do not suffer from this curse of dimensionality, although they are less robust to real-world noise. To leverage the strengths of both approaches, we propose OrthoMark, a framework that decouples robust feature extraction from message encoding. OrthoMark first learns a distortion-invariant feature representation using a deep robust feature extractor, and then performs watermark encoding and decoding in this feature domain using coding-based methods. Extensive experiments demonstrate that OrthoMark significantly improves the trade-off among visual quality, robustness, and capacity compared to prior deep watermarking methods, with particularly large gains in the high-capacity regime, effectively overcoming the curse of dimensionality. Our code is available at https://github.com/QQiuyp/OrthoMark.
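For readers unfamiliar with QIM, the following is a minimal sketch of textbook scalar QIM applied to a generic feature vector. It is not taken from the OrthoMark code: the function names `qim_embed` and `qim_decode`, the step size `delta`, and the random stand-in features are all illustrative assumptions. Each bit selects one of two interleaved quantizer lattices, and decoding picks the nearer lattice, so the per-bit rule does not change as the payload grows.

```python
import numpy as np

def qim_embed(x, bits, delta):
    """Quantize each coefficient onto the lattice selected by its bit.

    Bit 0 uses the lattice {k * delta}; bit 1 uses the same lattice
    shifted by delta / 2. (Textbook scalar QIM, assumed for illustration.)
    """
    x = np.asarray(x, dtype=float)
    offset = np.asarray(bits, dtype=int) * (delta / 2.0)
    return np.round((x - offset) / delta) * delta + offset

def qim_decode(y, delta):
    """Decode each coefficient as the bit whose lattice lies nearer."""
    y = np.asarray(y, dtype=float)
    d0 = np.abs(y - np.round(y / delta) * delta)
    shifted = np.round((y - delta / 2.0) / delta) * delta + delta / 2.0
    d1 = np.abs(y - shifted)
    return (d1 < d0).astype(int)

rng = np.random.default_rng(0)
features = rng.normal(size=256)          # stand-in for extracted features
message = rng.integers(0, 2, size=256)   # one bit per feature coefficient
delta = 0.5                              # hypothetical quantization step

marked = qim_embed(features, message, delta)
recovered_clean = qim_decode(marked, delta)

# Perturbations smaller than delta / 4 cannot move a value closer to the
# wrong lattice, so decoding still succeeds.
noisy = marked + rng.uniform(-0.1, 0.1, size=256)
recovered_noisy = qim_decode(noisy, delta)
```

In this sketch, `delta` controls the usual trade-off: a larger step tolerates larger perturbations but changes the features more (by at most `delta / 2` per coefficient).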
Lay Summary: When photos travel across the internet, we often need to know where they came from: who took the photo, whether it is real or AI-generated, or whether it was used without permission. One way to answer these questions is image watermarking: hiding invisible information inside the pixels, which can later be read back to identify the image's source.
A good watermark needs to be invisible to the eye, survive everyday changes like compression and cropping, and carry as much information as possible. Existing methods struggle to deliver all three. Older coding-based methods can hide a lot of information, but break when images are edited. Newer AI methods handle edits much better, but fail in a surprising way: as we try to hide more information, they suddenly cannot read it back, even from a clean image. We found this happens because the AI's hidden patterns start overlapping and interfering with each other.
We propose OrthoMark, which combines the strengths of both. An AI module finds image features that stay stable under edits. A coding-based module then writes the information using patterns designed not to overlap. OrthoMark hides very long messages in a single image while staying invisible and surviving real-world edits, making it useful for protecting copyright, tracking AI-generated images, and verifying where photos came from.
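The idea of overlapping versus non-overlapping patterns can be illustrated with a small numerical sketch. This is a generic linear-algebra example, not OrthoMark's actual construction: the sizes `dim` and `n_bits`, the helper `crosstalk`, and the use of a QR factorization to obtain orthonormal patterns are all assumptions made for illustration. Random patterns have nonzero inner products with one another, so each bit leaks into the readout of the others, whereas orthonormal patterns have none.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 256, 128   # hypothetical feature dimension and payload size

# Random unit-norm patterns: each pair overlaps slightly.
random_carriers = rng.normal(size=(n_bits, dim))
random_carriers /= np.linalg.norm(random_carriers, axis=1, keepdims=True)

# Orthonormal patterns from a reduced QR factorization: no pair overlaps.
q, _ = np.linalg.qr(rng.normal(size=(dim, n_bits)))
ortho_carriers = q.T     # shape (n_bits, dim), rows are orthonormal

def crosstalk(carriers):
    """Largest absolute inner product between two distinct patterns."""
    gram = carriers @ carriers.T
    return np.max(np.abs(gram - np.eye(len(carriers))))

# Write +1/-1 symbols as a sum of patterns, then read each one back
# by correlating the sum with its own pattern.
symbols = 2 * rng.integers(0, 2, size=n_bits) - 1
signal = symbols @ ortho_carriers
decoded = np.sign(ortho_carriers @ signal)
```

With orthonormal patterns the readout of each symbol is unaffected by the others, which is why `decoded` matches `symbols` here even with 128 symbols sharing one 256-dimensional vector.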
Originally Submitted Supplementary Material: zip
Link To Code: https://github.com/QQiuyp/OrthoMark
Primary Area: Applications->Computer Vision
Keywords: Deep watermarking
Originally Submitted PDF: pdf
Submission Number: 9170