Predicting incorrect mappings: a data-driven approach applied to DBpedia

Mariano Rico, Nandana Mihindukulasooriya, Dimitris Kontokostas, Heiko Paulheim, Sebastian Hellmann, Asunción Gómez-Pérez

2018 (modified: 06 Nov 2022)SAC 2018Readers: Everyone

Abstract: DBpedia releases consist of more than 70 multilingual datasets that cover data extracted from different language-specific Wikipedia instances. The data extracted from those Wikipedia instances are transformed into RDF using mappings created by the DBpedia community. Nevertheless, not all the mappings are correct and consistent across all the distinct language-specific DBpedia datasets. As these incorrect mappings are spread in a large number of mappings, it is not feasible to inspect all such mappings manually to ensure their correctness. Thus, the goal of this work is to propose a data-driven method to detect incorrect mappings automatically by analyzing the information from both instance data as well as ontological axioms. We propose a machine learning based approach to building a predictive model which can detect incorrect mappings. We have evaluated different supervised classification algorithms for this task and our best model achieves 93% accuracy. These results help us to detect incorrect mappings and achieve a high-quality DBpedia.

0 Replies