PEACE - An In-Domain and Cross-Domain Chinese Proposition Classification Evaluation Benchmark for Natural Language Processing

Anonymous

16 Dec 2022 (modified: 05 May 2023)
ACL ARR 2022 December Blind Submission
Readers: Everyone
Abstract: Natural language contains a large number of propositions with rich and varied expressions, and classifying them correctly helps natural language understanding and reasoning. However, most existing research is constrained by its reliance on logical constants, whereas most propositions in natural language are implicit. There is also a lack of a complete proposition classification system, of resources, and of research on cross-domain tasks. We propose the concept of the implicit proposition, which is better suited to NLP application scenarios. We present PEACE, a large-scale proposition classification dataset with implicit propositions, designed for in-domain and cross-domain proposition classification. It covers all tasks related to proposition classification, among which categorical proposition classification is put forward for the first time. The dataset contains over 45k sentences, multi-level classes, and 5 different domains. We use PEACE as a benchmark dataset and propose a series of proposition classification tasks. We use several popular machine learning methods as baselines and run experiments on each task. The results show that existing pre-trained models can classify all kinds of propositions relatively well, but cross-domain classification of non-modal propositions remains challenging. We release this benchmark in the hope of advancing research in natural language understanding, reasoning, and generation.
Paper Type: long
Research Area: Resources and Evaluation
