DocKS-RAG: Optimizing Document-Level Relation Extraction through LLM-Enhanced Hybrid Prompt Tuning

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
Abstract: Document-level relation extraction (RE) aims to extract comprehensive correlations between entities and relations from documents. Most existing works conduct transfer learning on pre-trained language models (PLMs), which yields richer contextual representations and improves performance. However, such PLM-based methods struggle to incorporate structural knowledge, such as entity-entity interactions. Moreover, current works struggle to infer implicit relations between entities that appear in different sentences, which results in poor predictions. To address these issues, we propose a novel and effective framework, named DocKS-RAG, which introduces extra structural knowledge and semantic information to further enhance the performance of document-level RE. Specifically, we construct a Document-level Knowledge Graph from the observable documentation data to better capture the structural information between entities and relations. Then, a Sentence-level Semantic Retrieval-Augmented Generation mechanism is designed to account for similarity across sentences by retrieving relevant contextual semantic information. Furthermore, we present a hybrid-prompt tuning method on large language models (LLMs) for specific document-level RE tasks. Finally, extensive experiments conducted on two benchmark datasets demonstrate that our proposed framework improves on all metrics compared with state-of-the-art methods.
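As a rough illustration of the idea described in the abstract (structural triples plus retrieved sentences combined into one prompt), the sketch below may help. It is not the authors' implementation: the function names `retrieve_sentences` and `build_hybrid_prompt`, the bag-of-words cosine similarity, the prompt layout, and the toy data are all assumptions made for illustration, and a real system would presumably use learned sentence embeddings and a tuned LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; a stand-in for a real sentence encoder (assumption).
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_sentences(query, sentences, k=2):
    # Rank sentences by similarity to the query and keep the top k.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s))), i) for i, s in enumerate(sentences)]
    scored.sort(key=lambda p: (-p[0], p[1]))  # best score first, stable on ties
    return [sentences[i] for _, i in scored[:k]]


def build_hybrid_prompt(head, tail, triples, sentences, k=2):
    # Structural part: triples from a (hypothetical) document-level graph
    # that mention either entity of the pair.
    facts = [t for t in triples if head in (t[0], t[2]) or tail in (t[0], t[2])]
    # Semantic part: sentences retrieved for the entity pair.
    context = retrieve_sentences(f"{head} {tail}", sentences, k)
    lines = ["Known triples:"]
    lines += [f"- ({h}, {r}, {t})" for h, r, t in facts]
    lines.append("Relevant sentences:")
    lines += [f"- {s}" for s in context]
    lines.append(f"Question: what is the relation between {head} and {tail}?")
    return "\n".join(lines)


# Toy document and graph (illustrative data only).
sentences = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "The weather was cold that year.",
]
triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
    ("Paris", "capital_of", "France"),
]
prompt = build_hybrid_prompt("Marie Curie", "Poland", triples, sentences)
print(prompt)
```

In this toy case the relation between the two entities is never stated in a single sentence, so the prompt has to carry both the linking triples and the two sentences that mention Warsaw, which is the cross-sentence setting the abstract targets.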
Lay Summary: Our work proposes DocKS-RAG, a framework for document-level relation extraction that enhances structural and semantic understanding by integrating a Document-level Knowledge Graph and a Sentence-level Semantic Retrieval-Augmented Generation mechanism. Additionally, we design a hybrid-prompt tuning method on large language models. Experiments on benchmark datasets show significant improvements over existing methods.
Primary Area: Deep Learning->Large Language Models
Keywords: Document-level relation extraction, Structural knowledge integration, Semantic retrieval, Hybrid-prompt tuning
Submission Number: 9793