Abstract: The rapid development of large language models (LLMs) has driven significant advances across many applications. However, the intellectual property embodied in these models is at risk when third parties reproduce them without authorization or encapsulate them inside other models. In this paper, we propose EasyDetector, a novel approach that uses linear probes to detect the provenance of LLMs. Our method aims to identify the original source model even after it has been fine-tuned or encapsulated into another model. Specifically, EasyDetector takes linear probes of the original model and uses them to classify the intermediate-layer representations of the new model. When the two models share a source, the probes achieve high accuracy; when they come from different sources, accuracy is low. Extensive experiments on diverse LLMs demonstrate the effectiveness of EasyDetector in detecting model provenance. The proposed method is lightweight and applicable to various model architectures, making it a practical tool for protecting the intellectual property of LLMs.
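The probe-transfer idea described above can be illustrated with a small synthetic sketch. This is not the paper's implementation: the "models" are single random layers with tanh activations standing in for an intermediate LLM layer, fine-tuning is imitated by a small weight perturbation, and every name and constant here (`make_model`, `fit_probe`, `probe_accuracy`, the ridge strength, the layer sizes) is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_HID = 32, 512  # hypothetical input and hidden sizes

def make_model(rng):
    # Toy stand-in for a model: one random weight matrix.
    return rng.normal(0.0, 0.5 / np.sqrt(D_IN), size=(D_IN, D_HID))

def hidden(weights, x):
    # "Intermediate-layer representation" of inputs x.
    return np.tanh(x @ weights)

def fit_probe(h, y, ridge=0.1):
    # Linear probe fitted by ridge-regularized least squares on +1/-1 labels.
    n, d = h.shape
    return np.linalg.solve(h.T @ h / n + ridge * np.eye(d), h.T @ y / n)

def probe_accuracy(probe, h, y):
    # Fraction of examples whose predicted sign matches the label.
    return float(np.mean(np.sign(h @ probe) == y))

source = make_model(rng)
derived = source + 0.2 * make_model(rng)  # assumed proxy for fine-tuning
unrelated = make_model(rng)               # independently initialized model

# Synthetic binary probing task: the label is the sign of a fixed projection.
v = rng.normal(size=D_IN)
x_train = rng.normal(size=(3000, D_IN))
x_test = rng.normal(size=(2000, D_IN))
y_train = np.sign(x_train @ v)
y_test = np.sign(x_test @ v)

# Train the probe on the source model only, then reuse it unchanged.
probe = fit_probe(hidden(source, x_train), y_train)

acc_source = probe_accuracy(probe, hidden(source, x_test), y_test)
acc_derived = probe_accuracy(probe, hidden(derived, x_test), y_test)
acc_unrelated = probe_accuracy(probe, hidden(unrelated, x_test), y_test)

print(f"source:    {acc_source:.3f}")
print(f"derived:   {acc_derived:.3f}")
print(f"unrelated: {acc_unrelated:.3f}")
```

In this toy setting the probe should stay accurate on the perturbed copy and fall to roughly chance level on the independently initialized model, which mirrors the high-versus-low accuracy signal the abstract describes. How a decision threshold is chosen, and which layers and probing tasks are used in practice, is not specified in the abstract and is left open here.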