Abstract: Large Language Models (LLMs) have revolutionized Natural Language Processing through advanced text generation capabilities. However, their use raises legal and ethical concerns, particularly related to copyright infringement. While traditional methods assess the entire generated output for potential violations, this study introduces a novel framework that detects copyright risks by analyzing LLMs' internal states before any text is generated. This proactive approach enhances efficiency by identifying issues early in the generation process. To implement this framework, we used a dataset of literary works to derive both the LLMs' internal states and reference materials. These were used to train a neural network classifier capable of detecting potential copyright concerns. Additionally, this method helps prevent the unintended release of copyrighted content, offering an extra layer of protection. We also integrated this framework into a Retrieval-Augmented Generation (RAG) system, using FAISS (Facebook AI Similarity Search) and SQLite to efficiently manage reference texts. These texts are sourced from a protected copyright database, improving the accuracy and reliability of our detection process. By comparing generated content to known copyrighted material, our system ensures better compliance with legal and ethical standards. Overall, our findings demonstrate the value of analyzing internal states for proactive copyright monitoring, providing a scalable and effective solution for responsible AI-driven text generation.
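The core idea of the abstract (training a classifier on internal states to flag copyright risk before generation) can be illustrated with a minimal sketch. This is not the paper's implementation: the hidden-state vectors here are synthetic stand-ins, the dimensionality and class separation are arbitrary assumptions, and the classifier is a plain logistic regression trained with gradient descent rather than the authors' neural network.

```python
# Hypothetical sketch: classify "internal states" as copyright-risky or safe.
# In the real framework, vectors would come from an LLM's hidden layers;
# here we fabricate two clusters of synthetic vectors for illustration.
import numpy as np

rng = np.random.default_rng(0)
dim = 64  # assumed hidden-state dimensionality

# Synthetic data: "risky" states are shifted away from "safe" ones.
safe = rng.normal(0.0, 1.0, size=(200, dim))
risky = rng.normal(0.8, 1.0, size=(200, dim))
X = np.vstack([safe, risky])
y = np.array([0] * 200 + [1] * 200)

# Simple logistic-regression classifier trained with gradient descent.
w = np.zeros(dim)
b = 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted risk probability
    w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on weights
    b -= lr * np.mean(p - y)                 # gradient step on bias

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
accuracy = (preds == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

In the full system described above, a positive prediction would trigger the RAG-side check against the protected copyright database before any text is released.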
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Model, Internal States, Copyright Infringement
Contribution Types: NLP engineering experiment, Theory
Languages Studied: English
Submission Number: 5525