Abstract: Relational Triple Extraction (RTE), a crucial component of information extraction, has developed rapidly in recent years. However, due to the triple duplication problem in existing datasets, previous methods often achieve highly competitive results by simply memorizing duplicated triples rather than discovering new triples from raw text. In the two most widely used datasets (NYT and WebNLG), more than 80% of the triples in the test set are direct duplicates of triples already present in the training set. In response, we propose a new dataset, named ENT, to evaluate a model's ability to Extract New Triples, which aligns more closely with the objectives of the RTE task. Based on Wikidata knowledge graph slices and Large Language Model prompting, we design an RTE dataset construction pipeline consisting of four steps: 1) Preprocessing, 2) Paragraph Generation, 3) Rule-based Check, and 4) Semantic Check. ENT comprises 300k+ unique triples, and every test set sample contains at least one new triple. We re-evaluate nine existing state-of-the-art methods and observe a consistent decrease in extraction accuracy of more than 10% and 7.5% on ENT compared to NYT and WebNLG, respectively. This demonstrates that ENT is a more challenging and meaningful benchmark, and we hope it will open new directions in the study of the RTE task.
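To make the duplication claim concrete, the following is a minimal sketch (not the authors' code) of how the train/test triple duplication rate could be measured. It assumes each split is stored as JSON lines with a "triples" field holding (subject, relation, object) tuples; the field name and file names are illustrative assumptions, since NYT and WebNLG releases use their own formats.

```python
# Minimal sketch: estimating the fraction of test triples that already
# appear verbatim in the training split. File format and field names are
# assumptions for illustration only.
import json


def load_triples(path):
    """Collect all (subject, relation, object) tuples from a JSON-lines split."""
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            triples.extend(tuple(t) for t in sample["triples"])
    return triples


def duplication_rate(train_path, test_path):
    """Fraction of test triples that also occur in the training set."""
    train_set = set(load_triples(train_path))
    test_triples = load_triples(test_path)
    duplicated = sum(1 for t in test_triples if t in train_set)
    return duplicated / len(test_triples)


if __name__ == "__main__":
    # Hypothetical file names.
    rate = duplication_rate("train.jsonl", "test.jsonl")
    print(f"Test triples duplicated from training set: {rate:.1%}")
```

Under this definition, a rate above 0.8 would correspond to the "more than 80%" duplication reported for NYT and WebNLG in the abstract.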
Paper Type: long
Research Area: Information Extraction
Contribution Types: Data resources
Languages Studied: English
Preprint Status: There is no non-anonymous preprint and we do not intend to release one.
A1: yes
A1 Elaboration For Yes Or No: In Section 7.
A2: yes
A2 Elaboration For Yes Or No: In Section 8.
A3: yes
A3 Elaboration For Yes Or No: The abstract is at the beginning of the PDF, and the introduction is in Section 1.
B: yes
B1: yes
B1 Elaboration For Yes Or No: In Section 3 and 4.
B2: yes
B2 Elaboration For Yes Or No: In Section 3.2.
B3: yes
B3 Elaboration For Yes Or No: In Section 8.
B4: yes
B4 Elaboration For Yes Or No: In Section 8.
B5: yes
B5 Elaboration For Yes Or No: In Section 3.3.
B6: yes
B6 Elaboration For Yes Or No: In Section 3.3.
C: yes
C1: yes
C1 Elaboration For Yes Or No: In Section 8.
C2: yes
C2 Elaboration For Yes Or No: In Section 4.
C3: yes
C3 Elaboration For Yes Or No: In Section 8.
C4: yes
C4 Elaboration For Yes Or No: In Section 4.
D: yes
D1: yes
D1 Elaboration For Yes Or No: In Appendix A.
D2: yes
D2 Elaboration For Yes Or No: In Appendix A.
D3: yes
D3 Elaboration For Yes Or No: In Appendix A.
D4: no
D4 Elaboration For Yes Or No: We do not have such an institution.
D5: yes
D5 Elaboration For Yes Or No: In Appendix A.
E: yes
E1: yes
E1 Elaboration For Yes Or No: In Section 3.