Keywords: Diffusion Large Language Models, Watermark, Text Generation
Abstract: Watermarking techniques enable the embedding of imperceptible signals into autoregressive large language models (LLMs), facilitating reliable detection and attribution of AI-generated text. However, for the emerging paradigm of diffusion large language models (dLLMs), which offer bidirectional context modeling, greater generation flexibility and controllability, and more efficient sampling, watermarking remains largely unexplored, raising critical concerns about copyright protection. In this work, we present the first systematic investigation of watermarking for dLLMs and introduce Ripple, a dedicated framework designed for the diffusion-based generation process. Ripple operates in two complementary stages, watermark injection and watermark calibration, to achieve seamless integration of watermark signals within dLLMs. Extensive experiments on two representative dLLMs across multiple datasets demonstrate that Ripple effectively balances detectability, robustness, and text quality. We believe this study provides a solid foundation for future research on watermarking in dLLMs.
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 8499