RINTAW: A Robust Invisible Watermark for Tabular Generative Models

Published: 06 Mar 2025, Last Modified: 16 Apr 2025
Venue: WMARK@ICLR2025
License: CC BY 4.0
Track: long paper (up to 9 pages)
Keywords: Tabular watermark
Abstract: Watermarking tabular generative models is critical for preventing misuse of synthetic tabular data. However, existing watermarking methods for tabular data often lack robustness against common attacks (e.g., row shuffling) or are limited to specific data types (e.g., numerical), restricting their practical utility. To address these challenges, we propose RINTAW, a novel watermarking framework for tabular generative models that is robust to common attacks while preserving data fidelity. RINTAW embeds watermarks by leveraging a subset of column values as seeds. To ensure the pseudorandomness of the watermark key, RINTAW employs an adaptive column selection strategy and a masking mechanism to enforce distribution uniformity. This approach guarantees minimal distortion to the original data distribution and is compatible with any tabular data format (numerical, categorical, or mixed) and generative model architecture. We validate RINTAW on six real-world tabular datasets, demonstrating that the quality of watermarked tables remains nearly indistinguishable from non-watermarked ones while achieving high detectability even under strong post-editing attacks. The code is available at https://github.com/fangliancheng/RINTAW.
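Illustrative sketch (not the authors' implementation): the abstract states that the watermark is seeded from a subset of each row's own column values. The Python below shows only why a seed derived this way survives row shuffling, since it depends on row contents rather than row position. The helper names `row_seed` and `row_bit`, the use of SHA-256, the secret key string, and the toy table are all assumptions made for illustration; the adaptive column selection and masking mechanism are not modeled.

```python
import hashlib
import random


def row_seed(row, seed_columns, secret_key):
    """Hypothetical helper: hash a secret key plus selected column values into an integer seed."""
    material = secret_key + "|" + "|".join(f"{col}={row[col]}" for col in seed_columns)
    digest = hashlib.sha256(material.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def row_bit(row, seed_columns, secret_key):
    """Hypothetical helper: derive one pseudorandom bit from the row's seed."""
    return random.Random(row_seed(row, seed_columns, secret_key)).getrandbits(1)


# Toy mixed-type table (assumed data, not taken from the paper).
table = [
    {"age": 34, "city": "Oslo", "income": 52000.0},
    {"age": 51, "city": "Lima", "income": 38000.0},
    {"age": 27, "city": "Pune", "income": 61000.0},
    {"age": 45, "city": "Kyiv", "income": 47000.0},
]
seed_columns = ["age", "city"]  # assumed choice of seed columns
secret_key = "demo-key"         # assumed secret


def tagged_bits(rows):
    # Pair each row's seed-column values with its derived bit, sorted for comparison.
    return sorted((r["age"], r["city"], row_bit(r, seed_columns, secret_key)) for r in rows)


bits_before = tagged_bits(table)

# Simulate a row-shuffling attack.
shuffled = table[:]
random.Random(0).shuffle(shuffled)
bits_after = tagged_bits(shuffled)

# Each row keeps the same bit regardless of its position in the table.
print(bits_before == bits_after)
```

Because the seed depends only on the selected column values and the key, editing a column outside `seed_columns` (here, `income`) also leaves the derived bit unchanged in this sketch.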
Presenter: ~Aiwei_Liu1
Format: Yes, the presenting author will definitely attend in person because they are attending ICLR for other complementary reasons.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 46