Keywords: Large Language Models (LLMs); Copyright infringement detection
Abstract: Large Language Models (LLMs) are trained on large datasets that may contain copyrighted material, which creates a risk of copyright infringement. Existing detection methods usually work at the word level or rely on a single feature, such as lexical similarity or surface overlap. However, LLMs can reproduce copyrighted content through semantic rewriting or logical transfer, which makes these methods less effective. We therefore propose TRIDENT, a \textbf{T}h\textbf{R}ee-D\textbf{I}mensional Method for \textbf{D}ata Copyright Infringem\textbf{ENT} Detection in LLMs. TRIDENT combines features along three dimensions (surface features, semantic relevance, and quality assessment) to handle both explicit replication and implicit infringement. Specifically, it uses a statistics-based method to provide interpretable significance verification and a learning-based method to enable efficient automated detection. Comparisons with the state-of-the-art method on GPT2-XL and DeepSeek-7B show that TRIDENT reduces the false positive rate from $44.85\%$ to $0.25\%$ and increases the true positive rate from $14.65\%$ to $99.7\%$, achieving an AUC close to $99.9\%$.
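As a reading aid for the metrics reported in the abstract, the following is a minimal sketch of how a false positive rate, a true positive rate, and an AUC can be computed from detector scores. It is not TRIDENT's implementation: the function names `tpr_fpr` and `auc`, the threshold of 0.5, and the toy scores and labels are all assumptions made for illustration and are not taken from the paper.

```python
def tpr_fpr(scores, labels, threshold):
    """Return (TPR, FPR) when samples with score >= threshold are flagged as infringing."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    pos = sum(labels)            # number of truly infringing samples
    neg = len(labels) - pos      # number of non-infringing samples
    return tp / pos, fp / neg


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = infringing sample, 0 = non-infringing sample.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]

tpr, fpr = tpr_fpr(toy_scores, toy_labels, threshold=0.5)
area = auc(toy_scores, toy_labels)
print(f"TPR={tpr:.3f} FPR={fpr:.3f} AUC={area:.3f}")
```

A lower FPR at a fixed threshold means fewer non-infringing samples are wrongly flagged, while the AUC summarizes ranking quality across all thresholds.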
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 11096