Abstract: In Open Information Extraction (OpenIE), acquiring manually annotated sentence-extraction pairs is expensive, while automatically labeled datasets may fail to accurately reflect real-world requirements for OpenIE systems. Existing neural models often demonstrate impressive performance on large-scale training sets but falter when tested on smaller-scale datasets due to the discrepancy in attributes between the training and test sets. In real-world scenarios, it is crucial for OpenIE systems to align closely with test sets, even when faced with limited annotated data for training. This paper introduces CycleOIE, a novel training framework applied to a pair of inverse text-to-text models. Through CycleOIE, we train a pair of T5 models on our curated dataset, LSOIE-g, achieving performance levels that surpass baselines trained on significantly larger fully supervised training sets. Ablation studies offer a detailed comparison between fully supervised training and CycleOIE, highlighting the effectiveness of CycleOIE on LSOIE-g as the primary factor in enhancing T5's OpenIE performance.
Paper Type: long
Research Area: Information Extraction
Contribution Types: Approaches to low-resource settings
Languages Studied: English