A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish

ACL ARR 2024 June Submission 4311 Authors

16 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) are trained on massive web-crawled corpora. A growing concern is that LLMs generate content derived from leaked data, including personal information, copyrighted text, and benchmark datasets, and that such outputs must be detected and suppressed. The root cause of this problem is leaked data contained in the training set. However, existing research has not sufficiently clarified how the number of leaked instances in the training data relates to how readily LLMs output them and how easily that leakage can be detected. In this paper, we conduct an experimental survey of the relationship between the rate of leaked instances in the training dataset and LLMs' generation and detection of leaked personal information, copyrighted texts, and benchmark data. Our experiments reveal that LLMs generate leaked information in most cases, even when such data are scarce in the training set. Furthermore, the lower the rate of leaked instances, the harder the leakage is to detect. Consequently, care is needed when addressing leakage in the training dataset: reducing the number of leaked instances does not have only positive effects. Finally, we demonstrate that explicitly defining the leakage detection task for LLMs using examples can mitigate the impact of the rate of leaked instances in the training data on detection.
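As a rough illustration (not the authors' exact setup), the final point can be read as few-shot prompting: the prompt explicitly states the leakage-detection task and includes labeled examples before the query instance. The sketch below is a minimal Python construction of such a prompt; the task wording, example texts, and labels are entirely hypothetical and not taken from the paper.

```python
# Hypothetical sketch: building a few-shot leakage-detection prompt.
# The task definition, examples, and labels are illustrative placeholders.

TASK_DEFINITION = (
    "Task: Decide whether the following text appeared in the model's "
    "training data. Answer 'leaked' or 'clean'."
)

# Labeled demonstrations that make the detection task explicit to the LLM.
FEW_SHOT_EXAMPLES = [
    ("Example passage suspected to be memorized ...", "leaked"),
    ("Example freshly written passage ...", "clean"),
]


def build_detection_prompt(query_text: str) -> str:
    """Assemble a prompt that defines the detection task with examples."""
    parts = [TASK_DEFINITION, ""]
    for text, label in FEW_SHOT_EXAMPLES:
        parts.append(f"Text: {text}\nAnswer: {label}\n")
    parts.append(f"Text: {query_text}\nAnswer:")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_detection_prompt("Candidate text to check for leakage ..."))
```

The resulting prompt string would then be passed to whichever LLM is under study; the point of the sketch is only that the task is defined explicitly with examples rather than queried zero-shot.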
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: Large language models, leakage detection
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 4311