Reconstructing Provenance

Published: 2012, Last Modified: 22 Dec 2024, Venue: ISWC (2) 2012, License: CC BY-SA 4.0
Abstract: Provenance is an increasingly important aspect of data management that is often underestimated and neglected by practitioners. In our work, we target the problem of reconstructing provenance of files in a shared folder setting, assuming that only standard filesystem metadata are available. We propose a content-based approach that is able to reconstruct provenance automatically, leveraging several similarity measures and edit distance algorithms, adapting and integrating them into a multi-signal pipeline. We discuss our research methodology and show some promising preliminary results.
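
As a rough illustration of the content-based idea, the sketch below links each file to the most similar file modified before it. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the similarity measures or edit distance algorithms used, so Python's `difflib.SequenceMatcher` stands in for them, and `FileRecord`, `similarity`, `reconstruct`, and the `threshold` value are hypothetical names and settings chosen for this example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, NamedTuple, Optional


class FileRecord(NamedTuple):
    """Hypothetical record: the only inputs are content and filesystem metadata."""
    name: str
    mtime: float  # modification time taken from filesystem metadata
    text: str     # file content


def similarity(a: str, b: str) -> float:
    """Stand-in similarity signal in [0, 1]; the paper combines several measures."""
    return SequenceMatcher(None, a, b).ratio()


def reconstruct(files: List[FileRecord], threshold: float = 0.8) -> Dict[str, Optional[str]]:
    """Map each file name to its most likely source file, or None if none qualifies.

    Assumption: a file can only derive from a file with an earlier mtime,
    and only if their similarity reaches `threshold`.
    """
    ordered = sorted(files, key=lambda f: f.mtime)
    edges: Dict[str, Optional[str]] = {}
    for i, current in enumerate(ordered):
        best_name, best_score = None, threshold
        for earlier in ordered[:i]:
            score = similarity(earlier.text, current.text)
            if score >= best_score:
                best_name, best_score = earlier.name, score
        edges[current.name] = best_name
    return edges


# Toy shared folder: the second file is a light edit of the first.
folder = [
    FileRecord("draft_v1.txt", 1.0, "The quick brown fox jumps over the lazy dog."),
    FileRecord("draft_v2.txt", 2.0, "The quick brown fox jumps over the lazy cat."),
    FileRecord("numbers.txt", 3.0, "1234567890 9876543210"),
]
provenance = reconstruct(folder)
```

A single similarity score with a fixed threshold is the simplest possible design; a multi-signal pipeline as described in the abstract would combine several such scores before deciding on a derivation edge.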