Abstract: Large language models have revolutionized text generation, offering significant benefits while also posing threats to society, such as copyright infringement and misinformation. To prevent harmful use, detecting machine-generated content has become an important research topic, though it remains particularly challenging across diverse content domains. This paper presents DGRM, an add-on module designed to improve the domain generalization capability of existing machine-generated text detectors. Our module consists of two training components. (1) Feature disentanglement separates a text's embedding into target-specific and common attributes, thereby enhancing semantic domain generalization across different content domains. (2) Feature regularization applies constraints to these attributes to extract additional target-relevant information and to ensure detection consistency under syntactic perturbations, thus achieving syntactic domain generalization. Evaluation on multiple datasets demonstrates that incorporating our module substantially improves the detection of machine-generated text across semantically and syntactically diverse domains. We hope our work contributes to mitigating the harmful use of language models.
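The two components can be illustrated with a minimal sketch: one projection splits a text embedding into target-specific and common attribute vectors, and a regularizer penalizes changes in the target-specific attributes under a syntactic perturbation of the same text. All names (`disentangle`, `consistency_loss`, the linear projections) are hypothetical illustrations, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical sketch of DGRM-style disentanglement + regularization.
# The real module is a learned neural component; here we use fixed
# random linear maps purely to show the data flow.
rng = np.random.default_rng(0)

d, k = 8, 4                       # embedding dim, attribute dim (assumed)
W_spec = rng.normal(size=(d, k))  # projects to target-specific attributes
W_comm = rng.normal(size=(d, k))  # projects to domain-common attributes

def disentangle(h):
    """Split a text embedding h into two attribute vectors."""
    return h @ W_spec, h @ W_comm

def consistency_loss(h, h_pert):
    """Regularizer: target-specific attributes should stay stable
    when the same text is syntactically perturbed."""
    z, _ = disentangle(h)
    z_p, _ = disentangle(h_pert)
    return float(np.mean((z - z_p) ** 2))

h = rng.normal(size=d)                   # embedding of a text
h_pert = h + 0.01 * rng.normal(size=d)   # embedding after a small perturbation
loss = consistency_loss(h, h_pert)
```

In training, a loss of this form would be minimized jointly with the detector's classification objective, pushing the target-specific attributes to ignore syntactic variation.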
External IDs: dblp:journals/tkde/ParkHC25