CATMark: A Context-Aware Thresholding Framework for Robust Cross-Task Watermarking in Large Language Models

17 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | Readers: Everyone | CC BY 4.0
Keywords: watermark, large language model, cross-task
Abstract: Watermarking algorithms for Large Language Models (LLMs) effectively identify machine-generated content by embedding and detecting hidden statistical features in text. However, such embedding degrades text quality, especially in low-entropy scenarios, where performance still needs improvement. Existing methods that rely on entropy thresholds often require significant computational resources for tuning and adapt poorly to unknown or cross-task generation scenarios. We propose $\textbf{C}$ontext-$\textbf{A}$ware $\textbf{T}$hreshold watermarking (CATMark), a novel framework that dynamically adjusts watermarking intensity based on real-time semantic context. CATMark partitions text generation into semantic states using logits clustering and establishes context-aware entropy thresholds that preserve fidelity in structured content while still embedding robust watermarks. Crucially, it requires no pre-defined thresholds or task-specific tuning. Experiments show that CATMark improves text quality in cross-task settings without sacrificing detection accuracy.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 9264
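
The abstract describes gating the watermark on a per-state entropy threshold. The sketch below illustrates that general idea only and is not the authors' implementation. Everything in it is an assumption made for illustration: the state centroids are taken as given (the clustering step is not shown), each state's threshold is the running mean of the entropies seen so far in that state, and the watermark itself is a simple green-list logit bias with a hypothetical strength `delta`. All function and class names are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def nearest_state(logits, centroids):
    # Assumption: a "semantic state" is the index of the closest centroid
    # (squared Euclidean distance) in logit space.
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(logits, c))
    return min(range(len(centroids)), key=lambda i: dist(centroids[i]))

class StateThresholds:
    """Assumption: a state's threshold is its running mean entropy."""
    def __init__(self, n_states):
        self.total = [0.0] * n_states
        self.count = [0] * n_states

    def update(self, state, h):
        self.total[state] += h
        self.count[state] += 1

    def threshold(self, state):
        if self.count[state] == 0:
            return float("inf")  # no history yet: skip watermarking
        return self.total[state] / self.count[state]

def watermark_step(logits, centroids, thresholds, green_ids, delta=2.0):
    # Returns (possibly biased logits, whether the watermark was applied).
    h = entropy(softmax(logits))
    state = nearest_state(logits, centroids)
    tau = thresholds.threshold(state)  # threshold from earlier steps only
    thresholds.update(state, h)
    if h <= tau:
        return list(logits), False     # low entropy for this state: leave as-is
    biased = [x + delta if i in green_ids else x for i, x in enumerate(logits)]
    return biased, True
```

In this sketch a token is watermarked only when its entropy exceeds what has been typical for its state so far, so no global threshold has to be tuned in advance. A detector would need to reproduce the same gating decisions, which is not shown here.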