Are Quality Dimensions Correlated? An Empirical Investigation Over Linked Data

Maria Angela Pellegrino, Anisa Rula, Gabriele Tuozzo

Published: 01 Jan 2026, Last Modified: 17 Jan 2026 | Crossref | CC BY-SA 4.0

Abstract: Data quality is a complex and multidimensional concept, hierarchically organized into categories, dimensions, and metrics. Although theoretical correlations among quality dimensions have been proposed, they have not yet been empirically validated. This study addresses this gap by systematically investigating whether such correlations hold in practice, comparing theoretical assumptions with empirical, data-driven insights over a five-quarter longitudinal analysis. Leveraging outputs from a freely available quality assessment tool, the findings challenge some prior assumptions, e.g., the relationship between timeliness and accuracy, while uncovering new correlations, including a positive association between interpretability and intrinsic-related dimensions. Through this empirical evidence, this work refines existing data quality models, enhances best practices for dataset management, and informs future efforts to optimize quality assessment frameworks in the Semantic Web.

External IDs: doi:10.1007/978-3-032-09527-5_2