Abstract: Event extraction (EE) is the task of identifying events and their types, along with the arguments involved. Despite the great success of sentence-level event extraction, events are more naturally presented in the form of documents, with event arguments scattered across multiple sentences. However, a major barrier to advancing document-level event extraction has been the lack of large-scale, practical training and evaluation datasets. In this paper, we present DocEE, a new document-level EE dataset with 20,000+ events and 100,000+ arguments. We highlight three features: large-scale annotations, fine-grained event arguments, and application-oriented settings. Experiments show that even state-of-the-art models perform poorly on DocEE, especially in cross-domain settings, indicating that DocEE remains a challenging benchmark. We will publish DocEE upon acceptance.