Identifying the Leak Sources of Hard Copy Documents

Published: 01 Jan 2022, Last Modified: 15 Apr 2025
Venue: IFIP Int. Conf. Digital Forensics 2022
License: CC BY-SA 4.0
Abstract: Technological advancements have made it possible to use relatively inexpensive hardware and software to replicate and leak sensitive documents. This chapter proposes a novel canary trap method for determining the source of a leaked hard copy document. The method generates self-identifying documents that secretly encode unique information about the individuals who receive them by modifying the inter-word spacing in the original reference document. The encoded information is robust to changes introduced during printing, scanning and copying, which makes the method useful for hard copy as well as digital documents. Because no suitable datasets are publicly available, a custom hard copy document leakage dataset comprising 100 scanned self-identifying documents encoded at four levels of robustness was created. This dataset was then used to evaluate the performance of the canary trap leak detection method.
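The general idea of hiding a recipient identifier in inter-word spacing can be illustrated with a minimal Python sketch. This is not the chapter's algorithm: the nominal gap `BASE_GAP`, the shift `DELTA`, the 8-bit identifier width, and the repeat-and-vote decoding are all assumptions chosen for illustration, and the abstract does not describe how its four robustness levels are implemented.

```python
from typing import List, Sequence

BASE_GAP = 10.0  # assumed nominal inter-word gap (arbitrary units)
DELTA = 2.0      # assumed shift that encodes one bit (wider = 1, narrower = 0)


def id_to_bits(recipient_id: int, width: int) -> List[int]:
    """Return the identifier as a list of bits, most significant bit first."""
    if not 0 <= recipient_id < 2 ** width:
        raise ValueError("recipient_id does not fit in the given width")
    return [(recipient_id >> i) & 1 for i in reversed(range(width))]


def encode_gaps(num_gaps: int, recipient_id: int, width: int = 8) -> List[float]:
    """Assign a gap width to each inter-word space, cycling through the bits."""
    bits = id_to_bits(recipient_id, width)
    return [BASE_GAP + (DELTA if bits[i % width] else -DELTA)
            for i in range(num_gaps)]


def decode_gaps(gaps: Sequence[float], width: int = 8) -> int:
    """Recover the identifier by summing deviations per bit position (a vote)."""
    votes = [0.0] * width
    for i, gap in enumerate(gaps):
        votes[i % width] += gap - BASE_GAP
    value = 0
    for vote in votes:
        value = (value << 1) | (1 if vote > 0 else 0)
    return value


if __name__ == "__main__":
    gaps = encode_gaps(num_gaps=24, recipient_id=0xA5)
    # Simulate small measurement noise such as a print-and-scan cycle might add.
    noisy = [g + (1.0 if i % 2 else -1.0) for i, g in enumerate(gaps)]
    print(hex(decode_gaps(noisy)))
```

In this sketch each bit is repeated across several gaps, so moderate measurement noise in individual gaps does not change the decoded identifier. A real system would also need to measure gaps from a scanned image and compensate for scaling and skew, which the sketch leaves out.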