Chinese Event Extraction from Handwritten Image

ACL ARR 2026 January Submission243 Authors

22 Dec 2025 (modified: 20 Mar 2026) · CC BY 4.0
Keywords: Event Extraction; Handwriting Recognition
Abstract: Event extraction, which is crucial for distilling semantic information from sentences, has traditionally focused on text-based input. However, handwriting, a long-standing data source born of the need for rapid recording, has largely been neglected in prior work. In this study, we turn to a more practical setting that aims to extract downstream event information directly from handwritten images. Under this setting, we first construct a benchmark dataset of manually annotated handwritten images. This dataset is especially challenging due to the corner cases widely distributed in handwritten images, such as typos and scribbled writing styles, not to mention severe cross-modality error propagation. We further propose the Chinese Handwriting Vision-Language Model (HVLM), which consists of three jointly trained subtasks targeting the root challenges of this setting. By emulating human reading habits, our model can quickly scan and precisely locate key information within an image, thereby improving overall extraction performance. Experiments demonstrate the substantial advantage of our proposed model over cutting-edge baselines, underscoring the necessity of introducing this new setting and thereby guiding holistic optimization for this real-world challenge.
Paper Type: Long
Research Area: Information Extraction and Retrieval
Research Area Keywords: event extraction
Contribution Types: Approaches to low-resource settings, Data resources
Languages Studied: Chinese
Submission Number: 243