Strong Copyright Protection for Language Models via Adaptive Model Fusion

Published: 03 Jul 2024 · Last Modified: 12 Jul 2024 · ICML 2024 FM-Wild Workshop Poster · CC BY 4.0
Keywords: language models, copyright, model fusion, safety, reliability
TL;DR: We propose CP-LLM, an algorithm that adaptively combines language models to minimize the reproduction of copyright-protected materials
Abstract: The risk that language models unintentionally reproduce copyrighted material from their training data has motivated the development of various protective measures. Meanwhile, model fusion has emerged as a promising approach for combining language models, although its potential for copyright protection remains unexplored. In this paper, we demonstrate that model fusion is an effective mechanism for copyright protection in language models. Specifically, we propose CP-LLM, an algorithm that adaptively combines language models to minimize the reproduction of protected materials. We show that CP-LLM satisfies the recently proposed near-access free (NAF) guarantees while also fulfilling a desirable balancing property that helps prevent copyright infringement. Our results demonstrate that CP-LLM significantly reduces memorization of copyrighted content while maintaining high-quality text generation.
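The abstract does not spell out how CP-LLM combines models, so the following is only a minimal sketch of the general idea of adaptive fusion: blending a deployed model with a "safe" model (one assumed never to have accessed the protected work, in the spirit of the NAF framework) so that generations stay close to a distribution that could not have memorized the content. The function names, the KL-based weighting rule, and the toy distributions are illustrative assumptions, not the paper's method.

```python
# Illustrative sketch only: the mixing scheme, weights, and toy "models" below
# are assumptions for intuition, not CP-LLM's actual update rule.
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence between two next-token distributions."""
    p, q = np.asarray(p, dtype=float) + eps, np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def fused_next_token_dist(p_model, p_safe, k=0.5):
    """Adaptively blend a deployed model with a 'safe' model: the further
    p_model drifts from p_safe (i.e. the more its prediction looks like
    verbatim memorization), the more weight the safe model receives."""
    d = kl(p_model, p_safe)
    alpha = min(1.0, d / k)  # hypothetical adaptive weight in [0, 1]
    fused = (1 - alpha) * np.asarray(p_model, dtype=float) + alpha * np.asarray(p_safe, dtype=float)
    return fused / fused.sum()

# Toy example: the deployed model is nearly deterministic on a memorized token (id 2);
# the safe model is not, so fusion pulls probability mass away from verbatim copying.
p_model = [0.01, 0.01, 0.97, 0.01]
p_safe  = [0.30, 0.30, 0.10, 0.30]
print(fused_next_token_dist(p_model, p_safe))
```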
Submission Number: 71