Abstract: In this paper, we motivate the need to include context in data cleaning in order to account for the subjective nature of data quality. Based on our recent work on incorporating ontologies into Functional Dependencies, we argue that ontologies are a rich source of context, and an effective tool for modeling domain concepts and relationships for data cleaning. Using real datasets, we present examples showing how ontologies can improve data cleaning workflows, and we outline open problems and directions for future work.