Keywords: Discrete Diffusion Language Models (dLLMs), Watermarking
TL;DR: dMARK embeds robust watermarks in discrete diffusion LLMs by guiding the decoding order with a parity key, achieving strong detectability with minimal degradation in text quality.
Abstract: We introduce dMARK, the first decoding-guided watermarking method for discrete diffusion language models (dLLMs). Unlike prior approaches that modify token probabilities, dMARK embeds watermark signals by steering the decoding order according to a binary hashing rule that prioritizes tokens whose indices match a target parity, leaving the underlying probability distribution intact. dMARK is broadly compatible with common decoding strategies (e.g., confidence-, entropy-, and margin-based) and can be further enhanced with beam search. Experiments on multiple dLLMs and benchmark datasets show that dMARK achieves strong detectability with minimal quality degradation. The watermark also remains robust under post-editing operations, including insertion, deletion, substitution, and paraphrasing, establishing decoding-guided watermarking as a practical solution for dLLMs.
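To make the decoding-order idea concrete, the following is a minimal sketch of one possible reading of the abstract, not the authors' implementation. It assumes that "index" means the vocabulary id of a position's candidate token, that the target parity is a bit derived from a secret key and the position via SHA-256, and that detection is a one-proportion z-test on the number of parity matches. The names `target_parity`, `pick_position`, and `detection_z_score` are hypothetical and do not appear in the source.

```python
import hashlib
import math


def target_parity(key: str, position: int) -> int:
    """Hypothetical keyed hash: map (key, position) to a parity bit, 0 or 1."""
    digest = hashlib.sha256(f"{key}:{position}".encode()).digest()
    return digest[0] & 1


def pick_position(candidates: dict, key: str) -> int:
    """Choose which masked position to unmask next.

    `candidates` maps each masked position to (token_id, confidence), i.e. the
    model's current best token for that position. Positions whose candidate
    token id matches the target parity are preferred; confidence only breaks
    ties. The model's probabilities themselves are never modified.
    """
    def rank(item):
        pos, (tok, conf) = item
        matches = (tok % 2) == target_parity(key, pos)
        return (matches, conf)

    pos, _ = max(candidates.items(), key=rank)
    return pos


def detection_z_score(token_ids: list, key: str) -> float:
    """Assumed detector: z-score of parity matches against a 50% null rate."""
    n = len(token_ids)
    if n == 0:
        raise ValueError("need at least one token")
    matches = sum(
        (tok % 2) == target_parity(key, pos) for pos, tok in enumerate(token_ids)
    )
    return (matches - 0.5 * n) / math.sqrt(0.25 * n)
```

In this sketch a parity-matching position is unmasked ahead of a more confident non-matching one, so committed tokens skew toward the keyed parity while each token is still the one the unmodified model proposed. Unwatermarked text would be expected to match about half the time, giving a z-score near zero.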
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 9925