Towards a Novel Approach for Knowledge Base Population Using Distant Supervision

Published: 01 Jan 2024, Last Modified: 07 Feb 2025, MCPR 2024, CC BY-SA 4.0
Abstract: Distant Supervision is an approach in Relation Extraction that automatically labels a dataset using a Knowledge Base as a guide. However, Knowledge Bases are incomplete, which poses a significant challenge: sentences that express a relation absent from the Knowledge Base are labeled incorrectly. This study introduces a novel approach to enhance and complete Knowledge Bases, aiming to reduce false negatives in Distant Supervision labeling. A key aspect of this approach is determining whether an instance expresses a specific relation. To address this, we propose a novel unsupervised method based on Deep Embedding Clustering. Experiments demonstrate the effectiveness of the proposed method, which outperforms several state-of-the-art methods even when subjected to varying percentages of incorrectly labeled instances. Furthermore, the proposed method shows promising performance in identifying relations with similar characteristics. Finally, we evaluate various threshold levels for determining the presence of a specific relation in an instance.
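
As an illustration of the labeling problem described in the abstract, the following is a minimal sketch of Distant Supervision labeling against an incomplete Knowledge Base. It is not the paper's method: the toy Knowledge Base, the relation names, the `NA` label, and the function `distant_label` are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy Knowledge Base mapping (head, tail) entity pairs to a relation.
# It is deliberately incomplete: (Berlin, Germany) is missing.
KB: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "Hawaii"): "born_in",
    ("Paris", "France"): "capital_of",
}

NO_RELATION = "NA"  # assumed label for pairs not found in the Knowledge Base

Sentence = Tuple[str, str, str]            # (text, head entity, tail entity)
Labeled = Tuple[str, str, str, str]        # (text, head, tail, relation label)


def distant_label(sentences: List[Sentence]) -> List[Labeled]:
    """Label each sentence with the KB relation of its entity pair, if any."""
    labeled = []
    for text, head, tail in sentences:
        # A missing pair falls back to NO_RELATION, whatever the sentence says.
        relation = KB.get((head, tail), NO_RELATION)
        labeled.append((text, head, tail, relation))
    return labeled


sentences = [
    ("Barack Obama was born in Hawaii.", "Barack Obama", "Hawaii"),
    ("Berlin is the capital of Germany.", "Berlin", "Germany"),
]

labels = distant_label(sentences)
for text, head, tail, relation in labels:
    print(f"{relation:>10}  {text}")
```

The second sentence clearly expresses a capital relation, yet it receives the `NA` label because the pair is absent from the toy Knowledge Base. This is the kind of false negative that completing the Knowledge Base is meant to reduce.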