Abstract: Recent advances in Large Language Models (LLMs) have raised urgent concerns about the authenticity of LLM-generated text, prompting regulatory demands for reliable identification mechanisms.
Although watermarking offers a promising solution, existing approaches struggle to simultaneously satisfy three requirements critical for practical deployment: text quality preservation, model-agnostic detection, and message embedding capacity.
The key challenge lies in balancing the trade-off between preserving text quality and maximizing message embedding capacity.
To address this challenge, we propose BiMark, a novel watermarking framework that achieves these requirements through three key innovations:
(1) a bit-flip unbiased reweighting mechanism enabling model-agnostic detection, (2) a multilayer architecture enhancing detectability without compromising generation quality, and (3) an information encoding approach supporting multi-bit watermarking.
Through theoretical analysis and extensive experiments, we validate that,
compared to state-of-the-art multi-bit watermarking methods, BiMark achieves up to 30\% higher extraction rates for short texts while maintaining text quality, as indicated by lower perplexity, and performs comparably to non-watermarked text on downstream tasks such as summarization and translation.
Lay Summary: How can AI-generated text be traced to ensure responsible AI applications? We address this issue by embedding traceable information into AI-generated text, like a digital fingerprint. The challenge is to preserve the quality of the generated text while embedding as much information as possible.
To achieve this goal, we developed a method called BiMark that slightly influences how the AI chooses words during text generation. Traceable information is encoded during generation using a fair coin-flip mechanism, and the hidden information can later be reliably extracted by analyzing how the candidate words are distributed.
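To give a concrete feel for the coin-flip idea, the following is a minimal sketch of a generic unbiased bit-flip reweighting, not BiMark's actual mechanism (whose details are not in this summary). All helper names (`keyed_bit`, `reweight`) and the toy distribution are hypothetical. A keyed hash splits the vocabulary into two halves; a fair coin decides which half gets boosted; averaging over the two coin outcomes recovers the original word distribution exactly, which is what "unbiased" means here.

```python
import hashlib

def keyed_bit(token_id, context, key):
    """Hypothetical PRF: partitions the vocabulary into two halves using a
    secret key and the preceding context (real schemes hash an n-gram window)."""
    digest = hashlib.sha256(f"{key}:{context}:{token_id}".encode()).digest()
    return digest[0] & 1

def reweight(probs, bits, coin, alpha=1.0):
    """Bit-flip reweighting: boost tokens whose partition bit matches the
    fair-coin outcome `coin`, suppress the rest. The result is a valid
    distribution, and its average over coin=0 and coin=1 equals `probs`."""
    green = sum(p for p, b in zip(probs, bits) if b == coin)  # boosted mass
    red = 1.0 - green                                         # suppressed mass
    return [p * (1 + alpha * red) if b == coin else p * (1 - alpha * green)
            for p, b in zip(probs, bits)]

# Toy next-token distribution over a 6-word vocabulary.
probs = [0.4, 0.2, 0.15, 0.1, 0.1, 0.05]
bits = [keyed_bit(t, "the cat sat", key="secret") for t in range(6)]

# Unbiasedness check: the coin-average of the two reweighted
# distributions recovers the original distribution exactly.
avg = [0.5 * (a + b) for a, b in
       zip(reweight(probs, bits, 0), reweight(probs, bits, 1))]
assert all(abs(a - p) < 1e-12 for a, p in zip(avg, probs))
```

At detection time, a scheme like this recomputes the partition bits from the secret key and counts how often the observed words fall on the boosted side; watermarked text aligns with the coin more often than chance, while the averaging property above keeps the output distribution, and hence text quality, unchanged in expectation.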
Our method enables more trustworthy AI applications by providing a way to verify the authenticity and origin of AI-generated content, helping combat misinformation and protect intellectual property.
Primary Area: Social Aspects->Security
Keywords: Language Model Watermarking, Text Generation, Model Security
Submission Number: 14283