WaterSearch: A Quality-Aware Search-based Watermarking Framework for Large Language Models

ACL ARR 2026 January Submission 9300 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: text watermark, watermark detection, large language models
Abstract: Watermarking safeguards the accountability and trust of LLM-generated text by embedding identifiable signals for reliable attribution. Existing methods typically manipulate token generation probabilities to embed signals, with detection performed via corresponding statistical metrics. Despite their effectiveness, these methods inherently face a trade-off between detectability and text quality: the signal strength and randomness required for robust watermarking tend to degrade downstream task performance. This paper designs a novel embedding scheme that controls seed pools for diverse parallel generation of watermarked text and proposes WaterSearch, a sentence-level, search-based watermarking framework adaptable to existing methods. WaterSearch enhances text quality by jointly optimizing two key aspects: (1) distribution fidelity and (2) watermark detection significance. We evaluate our method on three popular LLMs across ten diverse tasks. Extensive experiments show that our method consistently outperforms baselines, achieving an average improvement of 51.01% and substantial gains in challenging settings such as short-text and low-entropy generation, with improvements of 47.78% and 36.47%, respectively. Moreover, under token perturbation and paraphrase attacks, WaterSearch maintains high detectability, demonstrating its robustness.
Paper Type: Long
Research Area: Language Models
Research Area Keywords: robustness, safety and alignment, security and privacy
Languages Studied: English
Submission Number: 9300
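
The search idea summarized in the abstract (generate several watermarked candidates from a seed pool, then keep the one that balances fidelity and detection significance) can be illustrated with a toy sketch. This is not the paper's algorithm: the hash-based green-list test, the Jaccard overlap used as a stand-in for distribution fidelity, the logistic squashing, and the names `GAMMA`, `is_green`, `z_score`, `fidelity`, `select_candidate`, and `alpha` are all assumptions made for illustration.

```python
import hashlib
import math

GAMMA = 0.5  # assumed fraction of the vocabulary treated as "green"


def is_green(prev_token, token, seed):
    """Toy green-list membership: hash of (seed, previous token, token)."""
    digest = hashlib.sha256(f"{seed}|{prev_token}|{token}".encode()).digest()
    return digest[0] / 256.0 < GAMMA


def z_score(tokens, seed):
    """One-proportion z-test on the number of green tokens (toy detector)."""
    n = len(tokens) - 1  # number of (previous, current) token pairs
    if n <= 0:
        return 0.0
    green = sum(is_green(a, b, seed) for a, b in zip(tokens, tokens[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


def fidelity(reference, candidate):
    """Jaccard token overlap, a crude stand-in for distribution fidelity."""
    ref, cand = set(reference), set(candidate)
    union = ref | cand
    return len(ref & cand) / len(union) if union else 1.0


def select_candidate(reference, candidates, alpha=0.5):
    """Pick the (seed, tokens) pair with the best combined score.

    alpha weights fidelity; (1 - alpha) weights detection significance,
    squashed to (0, 1) with a logistic function so both terms are comparable.
    """
    def score(item):
        seed, tokens = item
        significance = 1.0 / (1.0 + math.exp(-z_score(tokens, seed)))
        return alpha * fidelity(reference, tokens) + (1 - alpha) * significance

    return max(candidates, key=score)
```

In this sketch, `reference` would be one unwatermarked sentence and `candidates` the watermarked variants produced under different seeds; a real system would score fidelity with model probabilities or embeddings rather than token overlap.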