Where Am I From? Identifying Origin of LLM-generated Content

ACL ARR 2024 June Submission 3654 Authors

16 Jun 2024 (modified: 04 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Generative models, particularly large language models (LLMs), have demonstrated remarkable proficiency in producing natural, high-quality content. However, the widespread use of such models raises significant concerns about copyright, privacy, and security vulnerabilities associated with AI-generated material. In response to these concerns, our objective is to develop digital forensics methods tailored to large language models that trace the generator of a given piece of AI-generated content. Our methodology begins by incorporating a secret watermark into the generated output, which enables traceability without requiring model retraining. To enhance effectiveness, especially for short outputs, we introduce a depth watermark. This framework ensures that content can be traced back to its original source, achieving both accurate tracing and high-quality generation. Extensive experiments across diverse settings and datasets validate the effectiveness and robustness of our proposed framework.
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: Security and privacy
Languages Studied: English
Submission Number: 3654