RAP-SM: Robust Adversarial Prompt via Shadow Models for Copyright Verification of Large Language Models

ACL ARR 2025 May Submission 2388 Authors

19 May 2025 (modified: 29 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Advancements in large language models (LLMs) have intensified the need for effective intellectual property (IP) safeguards, with fingerprinting emerging as a key strategy. Existing fingerprint verification approaches are often limited to individual models, thereby inadequately capturing the shared intrinsic properties of related model series. To address this limitation, we propose RAP-SM (Robust Adversarial Prompt via Shadow Models), a novel framework for extracting a public fingerprint applicable to an entire lineage of LLMs. By leveraging shadow models, RAP-SM generates robust adversarial prompts that serve as the basis for this shared fingerprint. Extensive experimental results confirm that RAP-SM successfully distills intrinsic commonalities across diverse models and exhibits significant robustness against adversarial manipulations. This research presents RAP-SM as a promising pathway towards scalable and resilient fingerprint verification, offering improved defenses against potential model misappropriation.

Paper Type: Long

Research Area: Language Modeling

Research Area Keywords: security and privacy

Contribution Types: Theory

Languages Studied: English

Submission Number: 2388
