Abstract: Developing algorithms to differentiate between machine-generated and human-written texts has garnered substantial attention in recent years. Existing methods typically assume an offline setting, where a dataset containing a mix of real and machine-generated texts is given upfront, and the task is to determine whether each sample in the dataset comes from a large language model (LLM) or a human. In many practical scenarios, however, sources such as news websites, social media accounts, and online forums publish content in a streaming fashion. In this online setting, quickly and accurately determining whether a source is an LLM, with strong statistical guarantees, is crucial for these media and platforms to function effectively and to curb the spread of misinformation and other potential misuse of LLMs. To tackle the problem of online detection, we develop an algorithm based on sequential hypothesis testing by betting that builds upon and complements existing offline detection techniques while enjoying statistical guarantees, including a controlled false positive rate and a bounded expected time to correctly identify a source as an LLM. Experiments demonstrate the effectiveness of our method.
Lay Summary: Large language models (LLMs) can produce content with qualities on par with human-level writing. While powerful, they may also be misused to spread fake news or manipulate public opinion—especially on online platforms like news websites or social media, where texts arrive sequentially. How can we detect whether a stream of texts is from an LLM or a human, and do so quickly and reliably?
Many existing detectors classify individual texts effectively but are not designed for online settings, and some require threshold tuning. A naive adaptation—flagging the source as an LLM once any single text is detected as such—leads to a high false positive rate as the number of texts grows. We frame online LLM detection as a sequential hypothesis testing problem and address it using a method that builds on existing detectors with a game-theoretic approach called testing by betting.
Our method offers rigorous statistical guarantees and does not assume any specific form for the underlying data distribution. It can be combined with existing detectors to help platforms achieve fast and robust detection of LLM sources in online settings.
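To illustrate the core idea of testing by betting, here is a minimal sketch, not the paper's actual algorithm. It assumes a hypothetical per-text detector that outputs scores in [0, 1] with mean at most 0.5 under the null hypothesis (human-written source); a bettor's wealth grows multiplicatively with each score, and the source is flagged once the wealth crosses 1/alpha, which by Ville's inequality keeps the false positive rate at most alpha:

```python
import random

def betting_test(scores, alpha=0.05, bet=0.5):
    """Sequential test by betting on a stream of detector scores.

    Assumes scores lie in [0, 1] and have mean <= 0.5 under the null
    (human-written source), so the wealth process is a nonnegative
    supermartingale under the null. Ville's inequality then bounds the
    probability of ever crossing 1/alpha, i.e., the false positive rate,
    by alpha. Returns the stopping time, or None if never rejected.
    """
    wealth = 1.0
    for t, s in enumerate(scores, start=1):
        # Multiplicative payoff; nonnegative since bet <= 2 and s in [0, 1].
        wealth *= 1.0 + bet * (s - 0.5)
        if wealth >= 1.0 / alpha:
            return t  # source flagged as an LLM
    return None

# Simulated LLM stream: scores concentrated above the null mean of 0.5.
random.seed(0)
llm_scores = [min(1.0, max(0.0, random.gauss(0.7, 0.1))) for _ in range(200)]
print(betting_test(llm_scores))  # stops after a modest number of texts
```

The fixed bet size here is only for illustration; in practice the betting fraction is chosen adaptively (e.g., via online optimization), which is what yields the expected-detection-time guarantees.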
Link To Code: https://github.com/canchen-cc/online-llm-detection
Primary Area: General Machine Learning->Everything Else
Keywords: LLM-Generated Text, Text Source Detection, Sequential Hypothesis Testing, Online Optimization
Submission Number: 14256