Abstract: Recent advances in natural language generation make it possible to automatically produce fluent articles. If such text is maliciously used to mislead the public, it poses potential social risks. Mitigating these risks requires automatic discriminators that detect machine-generated text. However, in real-world settings it is hard for humans to identify machine-generated text, so collecting it is difficult and discriminators can only be trained on insufficient data. It is also hard to synthesize machine-generated data ourselves, because in real-world situations we cannot know the masking strategy behind the collected machine-generated text. In this paper, we find that even with a small amount of training data, the saliency scores computed by a trained discriminator can reveal the masking strategy of the machine-generated text in the training set. Based on this observation, we propose a data augmentation method, CopyCAT, which mimics the masking strategy of the collected machine data using the information revealed by the saliency scores. Our experiments show that a discriminator trained with our augmented data achieves up to a 10% gain in accuracy.
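The abstract does not specify how the saliency score is computed. As a minimal illustrative sketch only (not the paper's method), the snippet below uses gradient-times-input saliency on a toy linear discriminator: for a score w·Σᵢeᵢ over token embeddings eᵢ, the per-token saliency is |w·eᵢ|, and the highest-scoring token is a candidate for where the masking strategy operated. All names (`token_saliency`, the toy embeddings `E`, weight vector `w`) are hypothetical.

```python
import numpy as np

def token_saliency(embeddings, w):
    """Gradient-times-input saliency for a toy linear discriminator.

    For score = w . sum_i e_i, the gradient w.r.t. each token
    embedding e_i is w, so gradient-times-input reduces to |w . e_i|.
    This is an illustrative stand-in, not CopyCAT's actual scorer.
    """
    return np.abs(embeddings @ w)

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
E = rng.normal(size=(3, 4))   # hypothetical token embeddings
w = rng.normal(size=4)        # hypothetical discriminator weights

scores = token_saliency(E, w)
# The most salient token is a guess at where masking occurred.
most_salient = int(np.argmax(scores))
```

In this sketch, the per-token scores play the role the abstract describes: positions the discriminator relies on most are taken as hints about the unknown masking strategy, which an augmentation method could then imitate.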