Keywords: Event Extraction, Information Extraction
Abstract: Events are crucial to understanding a specific domain.
Extensive event extraction research has been conducted in many domains such as news, finance, and biology.
However, event extraction in the scientific domain still lacks comprehensive datasets and tailored methods.
Compared with other domains, the scientific domain has two characteristics: (1) denser nuggets and events, and (2) more complex information forms.
To address this gap with these two characteristics in mind, we first construct SciEvents, a large-scale multi-event document-level dataset with a schema tailored for the scientific domain.
It consists of 2,508 documents and 24,381 events, produced through multi-stage manual annotation and quality control.
Then, we propose EXCEEDS, an end-to-end scientific event extraction framework that encodes dense nuggets into a grid matrix and simplifies complex event extraction into a nugget-based grid modeling task.
Experiments on SciEvents demonstrate that EXCEEDS achieves state-of-the-art performance.
Both the SciEvents dataset and the EXCEEDS framework will be released publicly to facilitate future research.
Paper Type: Long
Research Area: Information Extraction and Retrieval
Research Area Keywords: Information Extraction
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 9840