High-Quality Noise Detection for Knowledge Graph Embedding with Rule-Based Triple Confidence

Yan Hong; Chenyang Bu; Xindong Wu

Published: 01 Jan 2021, Last Modified: 15 Feb 2025 | PRICAI (1) 2021 | CC BY-SA 4.0

Abstract: Knowledge representation learning is widely used in knowledge reasoning and related fields. Its goal is to represent the entities and relations of a knowledge graph as low-dimensional vectors. During automatic knowledge graph construction, the complexity of unstructured text and errors in the text itself can prevent construction tools from accurately capturing the semantic information the text contains. The result is high-quality noise: triples whose entity types match but whose semantics are wrong. Current knowledge representation learning methods assume that the knowledge in a knowledge graph is completely correct and ignore the noise introduced by automatic construction, which leads to errors in the vector representations of entities and relations. To reduce the negative impact of noisy data on the representation learning model, this study proposes a high-quality noise detection method that uses rule information. Based on the semantic association between triples that appear in the same rule, we propose the concept of rule-based triple confidence. The strategy for calculating triple confidence is inspired by probabilistic soft logic (PSL). This confidence weakens the influence of high-quality noise on model training. Experiments show that the proposed method is effective in dealing with high-quality noise.
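The abstract does not give the confidence formula, so the following is only a minimal sketch of how a PSL-inspired triple confidence could down-weight a suspect triple during training. It uses the standard Lukasiewicz conjunction and distance to satisfaction from PSL; the function names, the rule weight, and the confidence-weighted margin loss are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only: names and the loss form are assumptions,
# not taken from the paper.

def lukasiewicz_and(*truths):
    """Lukasiewicz t-norm used in PSL: max(0, sum(x_i) - (n - 1))."""
    return max(0.0, sum(truths) - (len(truths) - 1))


def distance_to_satisfaction(body_truths, head_truth):
    """PSL distance to satisfaction of the rule body -> head.

    Zero when the head is at least as true as the conjoined body.
    """
    return max(0.0, lukasiewicz_and(*body_truths) - head_truth)


def rule_based_confidence(body_truths, rule_weight=1.0):
    """Hypothetical confidence a rule lends to its head triple:
    the rule weight times the soft truth of the rule body."""
    return rule_weight * lukasiewicz_and(*body_truths)


def weighted_margin_loss(confidence, pos_score, neg_score, margin=1.0):
    """Margin ranking loss scaled by triple confidence.

    Scores are treated as distances (lower is better), as in
    translation-based embedding models. A low confidence shrinks
    the contribution of a possibly noisy triple to the update.
    """
    return confidence * max(0.0, margin + pos_score - neg_score)
```

In this sketch, a triple supported by a rule whose body atoms have soft truth 0.9 and 0.8 receives a confidence of about 0.7, while a triple whose supporting atoms are weak (for example 0.3 and 0.4) receives 0 and contributes nothing to the loss.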
