Multlingual Abstractive Event Extraction for the Real World

ICLR 2025 Conference Submission13508 Authors

28 Sept 2024 (modified: 13 Oct 2024)ICLR 2025 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: dataset, event extraction, multilingual, zero-shot, entity linking
TL;DR: We present a new abstractive formulation for the event extraction task, along with a new dataset covering 16 languages and a novel zero-shot system for it.
Abstract: Event extraction (EE) is a valuable tool for making sense of large amounts of unstructured data, with a wide range of real-world applications, from studying disease outbreaks to monitoring political violence. Current EE systems rely on cumbersome mention-level annotations, and event arguments are frequently restricted to ungrounded spans of text, which hinders the aggregation and analysis of extracted events. In this paper, we define a new abstractive event extraction (AEE) task that moves away from the surface form and instead requires a deeper wholistic understanding of the input text. To support research in this direction, we release a new multilingual, expert-annotated event dataset called Lemonade, which covers 16 languages, including several for which no event dataset currently exists. Lemonade has 41,148 events, and is based on the Armed Conflict Location and Event Data Project, which has been collecting and coding data on political violence around the globe for over a decade. We introduce a novel zero-shot AEE system Zest that achieves a score of 57.2% F1 on Lemonade. With our supervised model that achieves 71.6% F1, they represent strong baselines for this new dataset.
Supplementary Material: zip
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13508
Loading