SDB-DRE: Learning Structure, Definition and Boundary Makes LLMs Better Document-Level Relation Triplet Extractors
Abstract: In recent years, Large Language Models (LLMs) have demonstrated superior performance in information extraction tasks. Document-Level Relation Extraction (DocRE) can benefit from the powerful generative capabilities of these models. However, we observe that LLMs still face three challenges in DocRE tasks: Document Structure Parsing Error, Relation Definition Ambiguity, and Entity Boundary Recognition Error. To address these issues, we propose SDB-DRE, an LLM-based DocRE model that does not rely on pre-labeled entities. To tackle the Document Structure Parsing Error, we introduce a novel Structure-Aware QA training approach that enables LLMs to learn coreference relationships and entity types within the document. To resolve Relation Definition Ambiguity and Entity Boundary Recognition Error, we introduce relation definition learning and mention boundary learning in the second stage of relation extraction training. This improves the LLM's internal document representation, ensuring that the output triples are consistent with the relation definitions and have more accurate entity boundaries. Experimental results show that SDB-DRE, using single-stage inference, outperforms LLM-based methods that rely on multi-stage inference, achieving state-of-the-art performance.
Paper Type: Long
Research Area: Information Extraction
Research Area Keywords: Document-Level Relation Triplet Extraction
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 5380