Keywords: dataset watermarking, dataset ownership verification, data poisoning, backdoor attack
TL;DR: We present a harmless and stealthy targeted data poisoning approach to mark models trained on a protected dataset and detect them.
Abstract: Dataset ownership verification, the process of determining whether a given dataset was used to train a model, is necessary for detecting unauthorized data usage and data contamination.
Existing approaches, such as backdoor watermarking, rely on inducing a detectable behavior in the trained model on a part of the data distribution.
However, these approaches have limitations: they can degrade the model's performance or require impractical access to the model's internals.
Most importantly, previous approaches lack guarantees against false positives.
This paper introduces *data taggants*, a novel non-backdoor dataset ownership verification technique.
Our method uses pairs of out-of-distribution samples and random labels as secret *keys*, and leverages clean-label targeted data poisoning to subtly alter a dataset, so that models trained on it respond to the key samples with the corresponding key labels.
The keys are constructed so as to allow for statistical certificates with only black-box access to the model.
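For illustration, a minimal sketch of the black-box verification step this implies could look as follows; the names (`model`, `key_samples`, `key_labels`) and the use of a one-sided binomial test are assumptions for exposition, not the paper's exact protocol.

```python
# Hypothetical sketch of black-box ownership verification: query the suspect
# model on the secret key samples and test whether the number of key-label
# matches exceeds what chance predicts. Names and the exact statistical test
# are illustrative assumptions, not the authors' implementation.
from scipy.stats import binomtest

def verify_ownership(model, key_samples, key_labels, num_classes=1000, alpha=1e-3):
    # Black-box access: only top-1 predicted labels are needed.
    predictions = [model(x) for x in key_samples]
    matches = sum(int(pred == label) for pred, label in zip(predictions, key_labels))
    # Under the null hypothesis (model not trained on the marked dataset),
    # each random key label is matched with probability ~1/num_classes, so a
    # one-sided binomial test yields a statistical certificate (p-value).
    test = binomtest(matches, n=len(key_samples), p=1.0 / num_classes,
                     alternative="greater")
    return test.pvalue < alpha, test.pvalue
```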
We validate our approach through comprehensive and realistic experiments on ImageNet1k using ViT and ResNet models with state-of-the-art training recipes.
Our findings demonstrate that data taggants can reliably detect models trained on the protected dataset with high confidence, without compromising validation accuracy, and show their superiority over backdoor watermarking.
We demonstrate the stealthiness and robustness of our method against various defense mechanisms.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11140