KG-Cleaner: Jointly Learning to Identify and Correct Errors Produced by Information Extraction Systems


Nov 17, 2018 AKBC 2019 Conference Blind Submission readers: everyone
  • Keywords: knowledge graph, semantics-aware
  • TL;DR: An approach to cleaning noisy extractions that treats the IE system as a black box
  • Abstract: KG-Cleaner is a semantics-aware framework combining two different approaches for knowledge correction: text-based classification to identify, and schema-backed classification to correct, errors in data produced by information extraction systems. The approach is novel in being a standalone system that addresses both tasks jointly. We evaluate KG-Cleaner and other models on two collections: a Wikidata corpus of 700K facts and 5M fact-relevant sentences, and a collection of 30K facts extracted by systems participating in the 2015 TAC Knowledge Base Population task. We find that simple, parameter-efficient shallow neural networks, combined with a continuous relaxation of a discrete predicted latent variable, provide a good common representation for the two tasks, achieving absolute performance gains of 30-35 F1 points on the evaluation datasets for both credibility prediction and relation repair.
  • Archival status: Archival
  • Subject areas: Information Extraction