Information Extraction from Legal Wills: How Well Does GPT-4 Do?

Alice Saebom Kwak; Cheonkam Jeong; Gaetano Vincent Forte; Derek Bambauer; Clayton T Morrison; Mihai Surdeanu

Published: 07 Oct 2023, Last Modified: 01 Dec 2023

Venue: EMNLP 2023 Findings

Submission Type: Regular Short Paper

Submission Track: Resources and Evaluation

Submission Track 2: Information Extraction

Keywords: Information Extraction, Legal Natural Language Processing

TL;DR: This work presents a manually annotated dataset for Information Extraction (IE) from legal wills, and relevant in-context learning experiments on the dataset.

Abstract: This work presents a manually annotated dataset for Information Extraction (IE) from legal wills, along with in-context learning experiments on the dataset. The dataset consists of entities, binary relations between the entities (e.g., relations between testator and beneficiary), and n-ary events (e.g., bequest) extracted from 45 legal wills from two US states. This dataset can serve as a foundation for downstream tasks in the legal domain. It can also be used to evaluate the performance of large language models (LLMs) on this IE task. We evaluated GPT-4 on our dataset to investigate its ability to extract information from legal wills. Our evaluation results show that the model handles the task reasonably well: when given instructions and examples as a prompt, GPT-4 shows decent performance on both entity extraction and relation extraction. Nevertheless, the results also reveal that the model is not perfect. We observed inconsistent outputs (given a prompt) as well as prompt over-generalization.
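As a rough illustration of the three annotation layers named in the abstract (entities, binary relations, n-ary events), the Python sketch below models one possible in-memory representation. The class names, label strings, role names, and the sample text are all assumptions made for illustration; the dataset's actual schema and label set may differ.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical schema: every label and field name here is an illustrative
# assumption, not the dataset's actual annotation format.


@dataclass(frozen=True)
class Entity:
    id: str
    label: str  # e.g. "Testator", "Beneficiary", "Asset" (assumed labels)
    text: str   # surface span taken from the will


@dataclass(frozen=True)
class Relation:
    label: str  # a binary relation links exactly two entities
    head: str   # id of the first entity
    tail: str   # id of the second entity


@dataclass(frozen=True)
class Event:
    label: str  # e.g. "Bequest"
    # n-ary: any number of (role, entity_id) pairs
    arguments: Tuple[Tuple[str, str], ...]


# Invented sample annotations for a sentence such as
# "I, Jane Doe, give my house to my son, John Doe."
entities = [
    Entity("e1", "Testator", "Jane Doe"),
    Entity("e2", "Beneficiary", "John Doe"),
    Entity("e3", "Asset", "my house"),
]
relations = [Relation("child_of", head="e2", tail="e1")]
events = [
    Event(
        "Bequest",
        (("testator", "e1"), ("beneficiary", "e2"), ("asset", "e3")),
    )
]


def arity(event: Event) -> int:
    """Return the number of arguments attached to an event."""
    return len(event.arguments)


def referenced_ids(event: Event) -> set:
    """Return the set of entity ids an event refers to."""
    return {entity_id for _, entity_id in event.arguments}
```

The distinction the sketch tries to make visible is that a relation always connects exactly two entities, while an event such as a bequest can bundle any number of role-labeled participants.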

Submission Number: 5063
