Fine-Grained Provenance Collection over Scripts Through Program Slicing

João Felipe Pimentel, Juliana Freire, Leonardo Murta, Vanessa Braganholo

Published: 2016, Last Modified: 27 Jan 2026, Venue: IPAW 2016, License: CC BY-SA 4.0

Abstract: Collecting provenance from scripts is often useful for scientists to explain and reproduce their scientific experiments. However, most existing automatic approaches capture provenance at a coarse grain, for example, as a trace of user-defined functions. These approaches lack information about variable dependencies. Without this information, users may struggle to identify which functions really influenced the results, leading to the creation of false-positive provenance links. To address this problem, we propose an approach that uses dynamic program slicing to gather provenance from Python scripts. By capturing dependencies among variables, it is possible to expose execution paths inside functions and, consequently, to create a provenance graph that accurately represents the function activations and the results they affect.
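To make the idea of variable-level dependencies concrete, the following is a minimal sketch, not the authors' implementation. It assumes a script made only of top-level assignments to plain names, runs each statement in order, records which earlier statement defined every variable it reads, and then walks those links backward from a target variable. The names `dynamic_slice` and `SCRIPT` are hypothetical, and control flow and function bodies, which the approach in the abstract addresses, are deliberately left out.

```python
import ast

# Hypothetical example script: `label` and `threshold` do not influence `result`.
SCRIPT = """\
raw = [3, 1, 2]
threshold = 10
ordered = sorted(raw)
label = str(threshold)
result = ordered[0]
"""


def dynamic_slice(source, target):
    """Return the line numbers of the statements that influenced `target`.

    Assumption of this sketch: `source` contains only top-level
    assignments to plain variable names.
    """
    env = {}        # namespace in which the script is executed
    last_def = {}   # variable name -> line of the statement that last assigned it
    line_deps = {}  # line -> lines that defined the variables read on that line

    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            raise ValueError("this sketch supports only top-level assignments")
        reads = {n.id for n in ast.walk(stmt.value) if isinstance(n, ast.Name)}
        writes = {
            n.id
            for tgt in stmt.targets
            for n in ast.walk(tgt)
            if isinstance(n, ast.Name)
        }
        # Execute the statement so the recorded trace reflects a real run.
        module = ast.Module(body=[stmt], type_ignores=[])
        exec(compile(module, "<script>", "exec"), env)
        # Names never assigned by the script (e.g. builtins) are ignored.
        line_deps[stmt.lineno] = {last_def[r] for r in reads if r in last_def}
        for name in writes:
            last_def[name] = stmt.lineno

    # Walk the dependency links backward from the target's defining statement.
    slice_lines, pending = set(), [last_def[target]]
    while pending:
        line = pending.pop()
        if line not in slice_lines:
            slice_lines.add(line)
            pending.extend(line_deps[line])
    return sorted(slice_lines)


if __name__ == "__main__":
    print(dynamic_slice(SCRIPT, "result"))  # prints [1, 3, 5]
```

In this toy script, a coarse-grained trace would list every statement as a possible influence on `result`, whereas the slice excludes lines 2 and 4, which only feed `label`. That exclusion is the kind of false-positive link the abstract refers to.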

External IDs: dblp:conf/ipaw/PimentelFMB16