Abstract: Biomedical document-level relation extraction is a crucial technology for mining the biomedical relationships needed for clinical diagnosis, treatment, and medical discovery. Although existing intrasentential relation extraction methods have achieved significant results, information in biomedical literature is complex and scattered across sentences, so relation extraction techniques must handle cross-sentence information effectively. In particular, existing methods do not explicitly model coreference and anaphora in documents, which limits their understanding of complex document-level semantics. To address this issue, we propose a new document-level relation extraction model with coreference and anaphor graphs. By abstracting the document into an undirected graph that includes coreference and anaphor information, the framework effectively models the interactions between entities and leverages a graph convolutional network in conjunction with a pretrained language model to dynamically capture the graph structure. Additionally, shifting training and inference from the fine-grained entity-pair level to the coarse-grained document level significantly improves the model's efficiency while maintaining high extraction performance. Extensive experiments demonstrate that our model achieves a 5.3% increase in F1-score over baseline models on the BioRED dataset with higher efficiency, confirming its effectiveness for relation extraction in complex biomedical literature.