Interpretable Boundary-based Watermark Up to the condition of Lovász Local Lemma

25 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0
Keywords: Watermark, Model extraction attacks, Intellectual property protection
Abstract: Watermarking techniques have emerged as pivotal safeguards to defend the intellectual property of deep neural networks against model extraction attacks. Most existing watermarking methods rely on the identification of samples within randomly selected trigger sets. However, this paradigm is inevitably disrupted by ambiguous points that exhibit poor discriminability, leading to misidentification between benign and stolen models. To tackle this issue, we propose a boundary-based watermarking method that enhances the discernibility of the trigger set, further improving the ability to distinguish benign from stolen models. Specifically, we select trigger samples on the decision boundary of the base model and assign them the labels with the least probabilities, while providing a tight bound based on the Lovász Local Lemma. This approach ensures the watermark's reliability in identifying stolen models by improving the discriminability of trigger samples. Meanwhile, we provide a theoretical proof that the watermark can be effectively guaranteed under the constraints guided by the Lovász Local Lemma. Experimental results demonstrate that our method outperforms state-of-the-art watermarking methods on the CIFAR-10, CIFAR-100 and ImageNet datasets. Code and data will be released publicly upon acceptance of the paper.
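The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how "on the decision boundary" is measured, so the sketch assumes a simple proxy, the gap between the two largest class probabilities of the base model. The function name `select_boundary_triggers`, the parameter `k`, and the toy probabilities are all hypothetical.

```python
def select_boundary_triggers(probs, k):
    """Pick k samples closest to the decision boundary and label them.

    probs: list of per-sample class-probability lists from the base model
           (each with at least two classes).
    Assumption: closeness to the boundary is approximated by the margin
    between the top-2 probabilities; the paper may use another criterion.
    """
    margins = []
    for i, row in enumerate(probs):
        top1, top2 = sorted(row, reverse=True)[:2]
        margins.append((top1 - top2, i))

    # Smallest margin first: these samples are the most ambiguous.
    chosen = [i for _, i in sorted(margins)[:k]]

    # Assign each chosen sample the class with the least probability.
    labels = [min(range(len(probs[i])), key=probs[i].__getitem__)
              for i in chosen]
    return chosen, labels


# Toy base-model outputs for four samples and three classes (made up).
probs = [
    [0.90, 0.07, 0.03],  # confident, far from the boundary
    [0.48, 0.47, 0.05],  # top-2 margin 0.01
    [0.40, 0.35, 0.25],  # top-2 margin 0.05
    [0.36, 0.31, 0.33],  # top-2 margin 0.03
]
trigger_idx, trigger_labels = select_boundary_triggers(probs, k=2)
print(trigger_idx, trigger_labels)
```

In this toy input, samples 1 and 3 have the smallest margins, and each receives its least likely class as the trigger label, a label a benign model is unlikely to predict by chance.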
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4231