Supervised Relation Extraction is More Efficient When Approached as Graph-Based Dependency Parsing

ACL ARR 2025 July Submission 558 Authors

28 Jul 2025 (modified: 22 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: Large language models (LLMs) have emerged as a convenient tool for the relation extraction (RE) task, both in supervised and in-context learning settings. However, their supervised performance still lags behind that of much smaller architectures, which we argue is due to two main reasons. (i) For LLMs, the input and the labels live in the same prompt space, so both must be expanded into natural language, which decreases information density. (ii) An LLM has to generate the entities, entity labels, and relation labels from scratch by classifying over the entire vocabulary, while also formatting the output so that predictions can be extracted from it automatically. To demonstrate this, we evaluate LLMs and graph-based parsers on six RE datasets with sentence graphs of varying sizes and complexities. Our results show that LLM performance degrades increasingly, relative to graph-based parsers, as the number of relations per document grows, arguably making the latter the superior choice in the presence of complex annotated data.
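For readers unfamiliar with the graph-based framing contrasted with LLM generation above, the following is a minimal PyTorch sketch of a biaffine relation scorer in the spirit of the listed keywords. It is an illustrative assumption, not the authors' implementation: the class name, dimensions, and random "encoder output" are hypothetical, and the point is only that every (head, dependent, relation) triple is scored in one pass rather than generated token by token.

```python
# Hypothetical sketch of a biaffine relation scorer (not the paper's exact architecture):
# relation extraction framed as scoring head-dependent arcs, as in graph-based parsing.
import torch
import torch.nn as nn

class BiaffineRelationScorer(nn.Module):
    def __init__(self, hidden_dim: int, num_relations: int):
        super().__init__()
        # Separate projections for the "head" and "dependent" roles of each token.
        self.head_mlp = nn.Linear(hidden_dim, hidden_dim)
        self.dep_mlp = nn.Linear(hidden_dim, hidden_dim)
        # One (hidden+1) x (hidden+1) biaffine matrix per relation label (+1 for bias terms).
        self.U = nn.Parameter(torch.empty(num_relations, hidden_dim + 1, hidden_dim + 1))
        nn.init.xavier_uniform_(self.U)

    def forward(self, encodings: torch.Tensor) -> torch.Tensor:
        # encodings: (batch, seq_len, hidden_dim), e.g. from a pretrained encoder.
        h = torch.relu(self.head_mlp(encodings))   # (B, N, H)
        d = torch.relu(self.dep_mlp(encodings))    # (B, N, H)
        ones = torch.ones_like(h[..., :1])
        h = torch.cat([h, ones], dim=-1)           # append bias feature -> (B, N, H+1)
        d = torch.cat([d, ones], dim=-1)
        # scores[b, r, i, j]: score of relation r holding from token i (head) to token j (dependent).
        return torch.einsum("bih,rhk,bjk->brij", h, self.U, d)

# Usage: all candidate arcs are classified jointly over a fixed label set,
# instead of being generated and parsed out of free-form LLM output.
encoder_out = torch.randn(2, 16, 256)              # stand-in for contextual embeddings
scorer = BiaffineRelationScorer(hidden_dim=256, num_relations=8)
print(scorer(encoder_out).shape)                   # torch.Size([2, 8, 16, 16])
```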
Paper Type: Short
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: relation extraction, biaffine attention, large language model, graph-based parser
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings / efficiency
Languages Studied: English
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: We cite every dataset (Section 3) and model (Section 4).
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: We cite every dataset (Section 3) and model (Section 4).
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: We respect all the licenses of every dataset (Section 3) and model (Section 4).
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B4 Elaboration: No personal data or offensive content is found in the data used (Section 3).
B5 Documentation Of Artifacts: Yes
B5 Elaboration: We discuss data domains in Section 3.
B6 Statistics For Data: Yes
B6 Elaboration: We provide summary statistics for the datasets used in Section 3 and in Appendix A.
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: We report parameters in Section 4 and compute requirements in Section 7.
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: We report hyperparameters in Section 4.
C3 Descriptive Statistics: Yes
C3 Elaboration: We report means and standard deviations of F1 scores over multiple seeds in Section 5.
C4 Parameters For Packages: N/A
C4 Elaboration: We use the PyTorch and Transformers libraries with their default settings.
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: Yes
Submission Number: 558