SDB-DRE: Learning Structure, Definition and Boundary Makes LLMs Better Document-Level Relation Triplet Extractors

ACL ARR 2025 May Submission 5380 Authors

20 May 2025 (modified: 03 Jul 2025) | ACL ARR 2025 May Submission | Readers: Everyone | License: CC BY 4.0

Abstract: In recent years, Large Language Models (LLMs) have demonstrated superior performance in information extraction tasks. Document-Level Relation Extraction (DocRE) can benefit from their powerful generative capabilities. However, we observe that LLMs still face three challenges in DocRE tasks: Document Structure Parsing Error, Relation Definition Ambiguity, and Entity Boundary Recognition Error. To address these issues, we propose SDB-DRE, an LLM-based DocRE model that does not rely on pre-labeled entities. To tackle the Document Structure Parsing Error, we introduce a novel Structure-Aware QA training approach that enables LLMs to learn coreference relationships and entity types within the document. To resolve Relation Definition Ambiguity and Entity Boundary Recognition Error, we introduce relation definition learning and mention boundary learning in the second stage, relation extraction training. This improves the LLM's internal document representation, ensuring that the output triples are consistent with the relation definitions and have more accurate entity boundaries. Experimental results show that SDB-DRE, using single-stage inference, outperforms LLM-based methods that rely on multi-stage inference, achieving state-of-the-art performance.

Paper Type: Long

Research Area: Information Extraction

Research Area Keywords: Document-Level Relation Triplet Extraction

Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models

Languages Studied: English

Submission Number: 5380
