Keywords: Large Language Models (LLMs), Watermarking, Change-point, Epidemic change-point, multiple change-point
Abstract: With the increasing popularity of large language models, concerns over content authenticity have led to the development of myriad watermarking schemes. These schemes can be used to detect machine-generated text via an appropriate key, while remaining imperceptible to readers without such a key. The corresponding detection mechanisms usually take the form of statistical hypothesis tests for the existence of watermarks, spurring extensive research in this direction. However, the finer-grained problem of identifying which segments of a mixed-source text are actually watermarked is much less explored; existing approaches either lack scalability or lack theoretical guarantees robust to paraphrasing and post-editing. In this work, we introduce a unique perspective on such watermark segmentation problems through the lens of \textit{epidemic change-points}. By highlighting the similarities as well as the differences between these two problems, we motivate and propose \texttt{WISER}: a novel, computationally efficient watermark segmentation algorithm. We validate our algorithm by deriving finite-sample error bounds and establishing its consistency in detecting multiple watermarked segments within a single text. Complementing these theoretical results, our extensive numerical experiments show that \texttt{WISER} outperforms state-of-the-art baseline methods, in terms of both computational speed and accuracy, on various benchmark datasets embedded with diverse watermarking schemes. Our theoretical and empirical findings establish \texttt{WISER} as an effective tool for watermark localization in highly general settings. They also demonstrate how insight into a classical statistical problem can be developed into a theoretically valid and computationally efficient solution to a modern one.
Supplementary Material: zip
Primary Area: learning theory
Submission Number: 21383