Jointly Canonicalizing and Linking Open Knowledge Base via Unified Embedding Learning

Published: 23 Jan 2024, Last Modified: 23 May 2024TheWebConf24 OralEveryoneRevisionsBibTeX
Keywords: Open Knowledge Base Canonicalization, Open Knowledge Base Linking, Unified Embedding Learning
TL;DR: We propose a novel framework CLUE which can resolve Open Knowledge Base (OKB) canonicalization and OKB linking simultaneously and make these two tasks mutually reinforce each other via a unified embedding space.
Abstract: Recent years have witnessed increasing attention on the semantic knowledge integration between curated knowledge bases (CKBs) and open knowledge bases (OKBs), which is non-trivial due to the intrinsically heterogeneous features involved in CKBs and OKBs. OKB canonicalization (i.e., grouping synonymous noun phrases and relation phrases) and OKB linking (i.e., linking noun phrases and relation phrases with their corresponding entities and relations in a CKB) are regarded as two vital tasks to achieve the knowledge integration. Although these two tasks are inherently complementary with each other, previous studies just solve them separately or via superficial interaction. To address this issue, we propose CLUE, a novel framework that jointly encodes the OKB and CKB into a unified embedding space, to tackle OKB canonicalization and OKB linking simultaneously and make them benefit each other reciprocally. We design an expectation-maximization (EM) based approach to iteratively refine the unified embedding space via performing seed generation and embedding refinement alternately, by leveraging the deep interaction between OKB canonicalization and OKB linking. Curriculum learning is employed to yield high-quality canonicalization seeds and linking seeds adaptively, according to two elaborately designed metrics (i.e., a margin-based linking metric and an entropy-based cluster metric). A thorough experimental study over two public benchmark data sets demonstrates that our proposed framework CLUE consistently outperforms state-of-the-art baselines for the task of OKB canonicalization (resp. OKB linking) in terms of average F1 (resp. accuracy).
Track: Semantics and Knowledge
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: No
Submission Number: 2371
Loading