An Adaptive Entropy Threshold Watermark

ACL ARR 2025 May Submission6305 Authors

20 May 2025 (modified: 03 Jul 2025) · CC BY 4.0
Abstract: By embedding and detecting hidden features in text, watermarking algorithms for large language models can reliably identify machine-generated text. However, embedding these features degrades text quality, particularly in low-entropy scenarios such as code generation, where performance still needs improvement. Existing methods that determine entropy thresholds from preliminary experiments or historical text require substantial computation and time, and they adapt poorly to unseen tasks. In this work, we propose an adaptive entropy threshold watermarking method that determines thresholds automatically during both generation and detection. Specifically, we exploit the entropy distribution of sequences generated by the large model to identify task-specific entropy characteristics and compute an entropy threshold that filters out low-entropy segments. This improves detection capability while preserving the quality of code-related text. Experiments demonstrate that our method maintains code-related text quality and improves detection performance across diverse text tasks.
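The core idea in the abstract — derive a task-specific entropy threshold from the generated sequence itself, then score the watermark only on tokens above that threshold — can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: the quantile-based threshold rule, the green-list z-score detector, and all function names (`token_entropy`, `adaptive_threshold`, `detect`) are hypothetical choices for exposition, not the authors' exact formulation.

```python
import math

def token_entropy(probs):
    # Shannon entropy of one next-token probability distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def adaptive_threshold(entropies, quantile=0.25):
    # Hypothetical rule: take a low quantile of the sequence's own
    # entropy distribution as the task-specific threshold, so the
    # cutoff adapts to low-entropy tasks (e.g. code) automatically.
    ordered = sorted(entropies)
    idx = int(quantile * (len(ordered) - 1))
    return ordered[idx]

def detect(green_flags, entropies, gamma=0.5):
    # Standard green-list z-score (gamma = assumed green fraction),
    # but computed only over tokens whose entropy exceeds the
    # adaptively chosen threshold; low-entropy tokens are ignored.
    tau = adaptive_threshold(entropies)
    scored = [g for g, h in zip(green_flags, entropies) if h > tau]
    n = len(scored)
    if n == 0:
        return 0.0
    hits = sum(scored)
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

Here `green_flags` marks whether each generated token fell in the watermark's green list; skipping low-entropy tokens avoids penalizing positions where the model had essentially no choice, which is what makes detection robust on code-like text.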
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: watermark, large language model, code generation
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 6305