DiffKGW: Stealthy and Robust Diffusion Model Watermarking

Tianxin Wei; Ruizhong Qiu; Yifan Chen; Yunzhe Qi; Jiacheng Lin; Wenxuan Bao; Wenju Xu; Sreyashi Nag; Ruirui Li; Hanqing Lu; Zhengyang Wang; Chen Luo; Hui Liu; Suhang Wang; Jingrui He; Qi He; Xianfeng Tang

DiffKGW: Stealthy and Robust Diffusion Model Watermarking

Tianxin Wei, Ruizhong Qiu, Yifan Chen, Yunzhe Qi, Jiacheng Lin, Wenxuan Bao, Wenju Xu, Sreyashi Nag, Ruirui Li, Hanqing Lu, Zhengyang Wang, Chen Luo, Hui Liu, Suhang Wang, Jingrui He, Qi He, Xianfeng Tang

Published: 22 Feb 2026, Last Modified: 22 Feb 2026Accepted by TMLREveryoneRevisionsBibTeXCC BY 4.0

Abstract: Diffusion models are known for their supreme capability to generate realistic images. However, ethical concerns, such as copyright protection and the generation of inappropriate content, pose significant challenges for the practical deployment of diffusion models. Recent work has proposed a flurry of watermarking techniques that inject artificial patterns into initial latent representations of diffusion models, offering a promising solution to these issues. However, enforcing a specific pattern on selected elements can disrupt the Gaussian distribution of the initial latent representation. Inspired by watermarks for large language models (LLMs), we generalize the LLM KGW watermark to image diffusion models and propose a stealthy probability adjustment approach DiffKGW that preserves the Gaussian distribution of initial latent representation. In addition, we dissect the design principles of state-of-the-art watermarking techniques and introduce a unified framework. We identify a set of dimensions that explain the manipulation enforced by watermarking methods, including the distribution of individual elements, the specification of watermark shapes within each channel, and the choice of channels for watermark embedding. Through the empirical studies on regular text-to-image applications and the first systematic attempt at watermarking image-to-image diffusion models, we thoroughly verify the effectiveness of our proposed framework through comprehensive evaluations. On all the diffusion models, including Stable Diffusion, our approach induced from the proposed framework not only preserves image quality but also outperforms existing methods in robustness against a wide range of attacks.

Submission Type: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Ju_Sun1

Submission Number: 6280

Loading