Abstract: Concern about the propagation of fake news and data grows every day. At the same time, the analysis and mining of large data sets have become essential to decision-making processes, where incorrect data lead to misleading insights. However, there is no general-purpose, automatic mechanism to effectively ensure the consistency of temporal data in Linked Data sources such as Wikidata, Wikimedia's free knowledge base. In this paper, an approach based on cross-comparing date values to discover inconsistent data is proposed. In addition, the concept of the contemporary constraint, on which this approach is based, is defined and formalized in order to show how to find inconsistencies in a wide range of data sources. Our experimental results show that contemporary constraints are effective and can be used for multiple purposes in data curation and data quality analysis. As a success story, the contemporary constraint has been implemented in Wikidata.
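To make the idea of cross-comparing date values concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that a contemporary constraint requires two linked entities to have coexisted at some point in time, so a link is flagged only when the known dates prove that the two lifespans never overlap. The `Entity` record, its field names, and the sample entities are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Entity:
    # Hypothetical record; the field names are illustrative, not Wikidata's schema.
    name: str
    start: Optional[date]  # e.g. birth or inception; None if unknown
    end: Optional[date]    # e.g. death or dissolution; None if unknown or ongoing


def are_contemporary(a: Entity, b: Entity) -> bool:
    """Return False only when the known dates prove the lifespans never overlap."""
    if a.end is not None and b.start is not None and a.end < b.start:
        return False  # a ended before b began
    if b.end is not None and a.start is not None and b.end < a.start:
        return False  # b ended before a began
    return True  # overlapping, or not enough data to prove a violation


def find_violations(links):
    """Return the name pairs of linked entities that cannot have coexisted."""
    return [(a.name, b.name) for a, b in links if not are_contemporary(a, b)]


# Invented sample data, used only to exercise the check.
person_a = Entity("Person A", date(1700, 1, 1), date(1750, 1, 1))
person_b = Entity("Person B", date(1800, 1, 1), date(1850, 1, 1))
person_c = Entity("Person C", date(1740, 1, 1), None)

violations = find_violations([(person_a, person_b), (person_a, person_c)])
print(violations)
```

Treating missing dates as "no evidence of a violation" is a design choice of this sketch: it keeps false positives low when data are incomplete, at the cost of missing some real inconsistencies.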