Complex Open Information Extraction with Heterogeneous Syntax Forests

Published: 2025, Last Modified: 16 Mar 2026, Venue: ICASSP 2025, License: CC BY-SA 4.0
Abstract: Open Information Extraction (OIE) aims to extract relational triplets from open-domain texts. Existing methods, unfortunately, mostly struggle in the complex OIE setting because they fail to extract unseen words and underutilize syntactic features. In this work, we propose a novel system tailored for complex OIE, in which a generative PLM with non-autoregressive decoding is adopted for abstractive OIE generation. We introduce heterogeneous syntactic forests as features for the task, merging constituency and dependency forests into a unified syntax graph, which helps detect potential boundaries and relationships within complex terms. In addition, an Implicit-Explicit Contrastive Learning mechanism is devised to boost the perception of implicit relations and spans. Our system outperforms the current state-of-the-art model across seven OIE datasets.