Keywords: Knowledge Graph Completion (KGC), Tail Prediction, Context-Aware BERT (CAB-KGC), Large Language Models (LLMs), BERT
TL;DR: This paper introduces Context-Aware BERT (CAB-KGC) for knowledge graph completion, addressing tail prediction challenges by leveraging contextual information about neighboring nodes and relationships.
Abstract: Knowledge graph completion (KGC) seeks to predict missing entities (e.g., heads or tails) or relationships in knowledge graphs (KGs), which are often incomplete. Traditional embedding-based methods, such as TransE and ComplEx, have improved tail entity prediction but struggle to generalize to entities unseen during testing. Text-based models mitigate this issue by leveraging additional semantic context; however, their reliance on negative triplet sampling introduces high computational overhead, semantic inconsistencies, and data imbalance. Recent BERT-based approaches, such as KG-BERT, show promise but depend heavily on entity descriptions, which are often unavailable in KGs. Critically, existing methods overlook valuable structural information in the KG about the entities and relationships involved. To address these challenges, we propose Context-Aware BERT for Knowledge Graph Completion (CAB-KGC), a novel model that exploits contextual information from the entities and relations linked to a query within the graph to predict tail entities. CAB-KGC eliminates the need for entity descriptions and negative triplet sampling, significantly reducing computational complexity while improving performance. Additionally, we introduce the Evaluation based on Distance from Average Solution (EDAS) criterion to the KG domain, enabling a more comprehensive evaluation across diverse metrics. Experiments on standard datasets, including FB15k-237, WN18RR, CoDEx-S, and ConceptNet100K, demonstrate that CAB-KGC outperforms state-of-the-art methods on three of the four datasets. Notably, CAB-KGC achieves Hit@1 improvements of 6.88%, 14.32%, and 17.13% on WN18RR, CoDEx-S, and ConceptNet100K, respectively. Furthermore, EDAS rankings identify CAB-KGC as the top-performing model, highlighting its effectiveness and robustness for KGC tasks.
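To make the idea of a description-free, context-aware input concrete, the sketch below serializes the relations and entities linked to a query head into a textual context and scores candidate tails with a BERT sequence classifier. The toy triples, helper names, and binary plausibility head are illustrative assumptions for this sketch, not the authors' CAB-KGC implementation.

```python
# Hedged sketch: build a context-aware BERT input for tail prediction.
# Triples, helper names, and the scoring setup are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Toy KG: (head, relation, tail) triples.
triples = [
    ("paris", "capital_of", "france"),
    ("paris", "located_in", "europe"),
    ("france", "currency", "euro"),
]
candidate_tails = ["france", "europe", "euro", "germany"]

def neighbourhood_context(head, triples):
    """Collect the relations and entities linked to `head` as plain text."""
    parts = []
    for h, r, t in triples:
        if h == head:
            parts.append(f"{r} {t}")
        elif t == head:
            parts.append(f"{h} {r}")
    return " ; ".join(parts)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Binary head: is (head, relation, candidate_tail) plausible given the context?
# Note: the classification head is untrained here, so scores are illustrative only.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

head, relation = "paris", "capital_of"
context = neighbourhood_context(head, triples)

scores = {}
with torch.no_grad():
    for tail in candidate_tails:
        # Sentence A: query plus graph context; sentence B: candidate tail.
        inputs = tokenizer(f"{head} {relation} [SEP] {context}", tail,
                           return_tensors="pt", truncation=True)
        logits = model(**inputs).logits
        scores[tail] = logits.softmax(dim=-1)[0, 1].item()

print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Because the context comes from the graph itself, no entity descriptions or sampled negative triplets are required, which is the computational saving the abstract emphasizes.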
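The EDAS criterion mentioned above follows the standard multi-criteria formulation (positive and negative distances from the average solution). The sketch below applies it under the assumptions that all KGC metrics are benefit criteria with equal weights; the model names and metric values are made up for illustration.

```python
# Hedged sketch of the standard EDAS ranking applied to KGC results.
import numpy as np

# Rows: models; columns: benefit criteria (e.g., Hits@1, Hits@10, MRR).
models = ["Model A", "Model B", "Model C"]
X = np.array([
    [0.45, 0.70, 0.55],
    [0.30, 0.65, 0.42],
    [0.25, 0.50, 0.35],
])
w = np.full(X.shape[1], 1.0 / X.shape[1])  # equal criterion weights (assumption)

av = X.mean(axis=0)                   # average solution per criterion
pda = np.maximum(0, X - av) / av      # positive distance from average
nda = np.maximum(0, av - X) / av      # negative distance from average
sp, sn = pda @ w, nda @ w             # weighted sums of distances
nsp = sp / sp.max()                   # normalized positive score
nsn = 1 - sn / sn.max()               # normalized negative score
appraisal = (nsp + nsn) / 2           # final appraisal score in [0, 1]

for m, s in sorted(zip(models, appraisal), key=lambda t: -t[1]):
    print(f"{m}: {s:.3f}")
```

Ranking by the appraisal score aggregates all metrics into a single ordering, which is how EDAS supports the cross-metric comparison described in the abstract.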
Primary Area: learning on graphs and other geometries & topologies
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10755