Decomposition then watermarking: Enhancing code traceability with dual-channel code watermarking

Haibo Lin, Zhong Li, Ruihua Ji, Minxue Pan, Tian Zhang, Nan Wu, Xuandong Li

Published: 2026, Last Modified: 08 May 2026. Autom. Softw. Eng. 2026. License: CC BY-SA 4.0
Abstract: Code watermarking has gained increasing attention as a way to trace the provenance of code amid the rapid growth of the open-source community. Existing work on code watermarking has shown promising results yet still falls short, especially when a multi-bit watermark is required to encode diverse information. In this paper, we propose DWC, a novel code watermarking method with high watermark capacity. The key idea of DWC is to first decompose the code into natural and formal channels, then embed the watermark separately into each channel based solely on that channel's own information. As such, DWC reduces the mutual interference between the two channels and the impact of irrelevant information within the code, thus enabling more effective transformations for embedding watermarks with higher capacity and robustness. Our extensive experiments on source code snippets in four programming languages (C, C++, Java, and Python) demonstrate the effectiveness, efficiency, and capability of DWC in embedding multi-bit watermarks, as well as the utility and robustness of the watermarked code it generates.
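To make the dual-channel idea concrete, here is a toy sketch in Python. It is not DWC's actual algorithm, which the abstract does not specify. It assumes the natural channel can be approximated by identifier names and the formal channel by semantics-preserving syntax choices, and it hides one bit in each. The synonym table, class names, and the `embed`/`extract` helpers are all hypothetical and exist only for illustration.

```python
import ast

# Hypothetical synonym table for the natural channel:
# bit 0 keeps the first name, bit 1 switches to the second.
SYNONYMS = {"total": ("total", "acc")}

SAMPLE = """
def sum_list(xs):
    total = 0
    for x in xs:
        total += x
    return total
"""


class NaturalChannel(ast.NodeTransformer):
    """Encode one bit through the choice of an identifier name."""

    def __init__(self, bit):
        self.bit = bit

    def visit_Name(self, node):
        if node.id in SYNONYMS:
            node.id = SYNONYMS[node.id][self.bit]
        return node


class FormalChannel(ast.NodeTransformer):
    """Encode one bit through syntax: `x += y` (0) versus `x = x + y` (1).

    Assumes the input already uses `+=`, so bit 0 leaves it unchanged.
    """

    def __init__(self, bit):
        self.bit = bit

    def visit_AugAssign(self, node):
        is_simple_add = isinstance(node.target, ast.Name) and isinstance(node.op, ast.Add)
        if self.bit == 1 and is_simple_add:
            name = node.target.id
            expanded = ast.Assign(
                targets=[ast.Name(id=name, ctx=ast.Store())],
                value=ast.BinOp(
                    left=ast.Name(id=name, ctx=ast.Load()),
                    op=ast.Add(),
                    right=node.value,
                ),
            )
            return ast.copy_location(expanded, node)
        return node


def embed(source, natural_bit, formal_bit):
    """Embed one bit per channel; each transformer touches only its own channel."""
    tree = ast.parse(source)
    tree = NaturalChannel(natural_bit).visit(tree)
    tree = FormalChannel(formal_bit).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


def extract(source):
    """Recover (natural_bit, formal_bit) from code produced by `embed`."""
    nodes = list(ast.walk(ast.parse(source)))
    names = {n.id for n in nodes if isinstance(n, ast.Name)}
    natural_bit = 1 if "acc" in names else 0
    formal_bit = 0 if any(isinstance(n, ast.AugAssign) for n in nodes) else 1
    return natural_bit, formal_bit


if __name__ == "__main__":
    print(embed(SAMPLE, 1, 1))
```

Because the renaming and the syntax rewrite act on disjoint aspects of the code, either bit can be changed without disturbing the other, which is the kind of reduced interference the abstract attributes to channel decomposition. A real system would need many more transformations per channel to carry a multi-bit payload.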