RAP-SM: Robust Adversarial Prompt via Shadow Models for Copyright Verification of Large Language Models

ACL ARR 2025 May Submission2388 Authors

19 May 2025 (modified: 03 Jul 2025) | ACL ARR 2025 May Submission | CC BY 4.0
Abstract: Advancements in large language models (LLMs) have intensified the need for effective intellectual property (IP) safeguards, with fingerprinting emerging as a key strategy. Existing fingerprint verification approaches are often limited to individual models, thereby inadequately capturing the shared intrinsic properties of related model series. To address this limitation, we propose RAP-SM (Robust Adversarial Prompt via Shadow Models), a novel framework for extracting a public fingerprint applicable to an entire lineage of LLMs. By leveraging shadow models, RAP-SM generates robust adversarial prompts that serve as the basis for this shared fingerprint. Extensive experimental results confirm that RAP-SM successfully distills intrinsic commonalities across diverse models and exhibits significant robustness against adversarial manipulations. This research presents RAP-SM as a promising pathway towards scalable and resilient fingerprint verification, offering improved defenses against potential model misappropriation.
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: security and privacy
Contribution Types: Theory
Languages Studied: English
Submission Number: 2388