A Meta-pattern-enhanced Generative Few-shot Attribute Extraction Framework for Open-world Sparse Corpora

Published: 2024, Last Modified: 05 Nov 2025, MSN 2024, CC BY-SA 4.0
Abstract: Open-world attribute extraction is one of the most important information extraction tasks. It aims to mine all valuable attributes of entities and their corresponding values from unstructured text, usually in the form of (entity, attribute, value) triplets. However, existing methods have difficulty extracting attribute triplets from open-world sparse corpora, where the attribute names are not given in advance, especially in the few-shot scenario where only a few manual annotations are available. To address these problems, we propose a two-stage Meta-pattern-Enhanced Generative Few-shot Attribute Extraction (MEGFAE) framework, which discovers as many valuable attribute triplets as possible from open-world sparse corpora in a generative manner. For evaluation on open-world sparse corpora, we introduce a benchmark dataset called OSN-515, available at https://github.com/sunshower-liu/OSN-515. Experimental results verify the effectiveness of our framework and inspire future exploration of text mining on sparse corpora.