Generative Models are Self-Watermarked: Intellectual Property Declaration through Re-Generation

24 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Watermark, Generative Model, Re-generation, Fixed-point Theory
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Our work discovers the intrinsic fingerprint in the generative model and proposes an iterative re-generation approach to reinforce such a signal.
Abstract: Protecting intellectual property for generated data has emerged as a critical concern for AI corporations, as machine-generated content proliferates. Reusing generated data without permission poses a formidable barrier to safeguarding the intellectual property tied to these models. The verification of data ownership is further complicated by the use of Machine Learning as a Service (MLaaS), which often operates as a black-box system. Our work is dedicated to detecting data reuse from even an individual sample. In contrast to watermarking techniques that embed additional information as watermark triggers into models or generated content, our approach does not introduce artificial watermarks which may compromise the quality of model outputs. Our investigation reveals the existence of latent fingerprints inherently present within deep learning models. In response, we propose an explainable verification procedure to verify data ownership through re-generation. Furthermore, we introduce a novel methodology to amplify the model fingerprints through iterative data regeneration and a theoretical grounding on the proposed approach. We demonstrate the viability of our approach using recent advanced text and image generative models.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9219
Loading