Can Your Classifier Detect Boundaries? Adaptation of Artificial Text Detection Methods for the Real Or Fake Text Challenge

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: We study the robustness of topological data analysis-based methods, perplexity-based methods, and an LM-based detector on the artificial text boundary detection task in cross-domain and cross-model settings.
Abstract: Due to the rapid development of text generation models, people increasingly encounter texts that start out written by a human but then continue as AI-generated. Detecting the boundary between the human-written and machine-generated parts of such texts is a very challenging problem that has not received much attention in the literature. We consider a number of different approaches to artificial text boundary detection, comparing predictors built on features of very different kinds. We show that supervised fine-tuning of the RoBERTa model works well for in-domain detection of a single LLM but fails to generalize in important cross-domain and cross-generator settings, demonstrating a tendency to overfit to spurious features of the data. We then adapt perplexity-based approaches and propose novel algorithms based on features extracted from a frozen LLM's embeddings. We show that these approaches outperform human accuracy on the extremely hard Real or Fake Text benchmark. Analyzing the robustness of our approaches in cross-domain and cross-model settings, we discover important properties of the data that can hinder the performance of artificial text boundary detection algorithms.
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
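
To make the perplexity-based idea from the abstract concrete, here is a minimal illustrative sketch, not the authors' implementation. It scores a text with a frozen GPT-2 (an assumed choice of scoring model; the submission does not fix one here) and guesses the boundary at the sharpest drop in smoothed per-token loss, on the intuition that a machine-generated continuation is more predictable to the LM. The model name, window size, and the `boundary_guess` heuristic are all assumptions made for illustration.

```python
# Illustrative sketch only; not the paper's published method.
# Assumptions (not from the source): GPT-2 via Hugging Face `transformers`
# as the frozen scoring model, and a "largest drop in smoothed per-token
# loss" rule as the boundary heuristic.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def token_losses(text: str) -> torch.Tensor:
    """Per-token cross-entropy of `text` under the frozen LM."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    ids = enc.input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Shift so that the logits at position t predict token t+1.
    return torch.nn.functional.cross_entropy(
        logits[0, :-1], ids[0, 1:], reduction="none"
    )

def boundary_guess(text: str, window: int = 20) -> int:
    """Return the token index where the smoothed loss drops most sharply.

    Heuristic (an assumption, not a claim from the paper): machine
    continuations tend to be more predictable for the scoring LM, so the
    human-to-machine boundary often shows up as a drop in the rolling
    mean of per-token loss.
    """
    losses = token_losses(text)
    smoothed = torch.nn.functional.avg_pool1d(
        losses.view(1, 1, -1), kernel_size=window, stride=1
    ).squeeze()
    jumps = smoothed[1:] - smoothed[:-1]
    # Offset by half the window to roughly re-center the smoothed index.
    return int(torch.argmin(jumps)) + window // 2
```

In the Real or Fake Text setting, boundaries are reported at sentence granularity, so in practice the token index returned above would still need to be mapped back to a sentence position; that aggregation step is likewise left out of this sketch.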