Constraint-based Paraphrase Generation Model and Its Application to Message Extraction of Commodity Futures News
Abstract: Message extraction in the Chinese futures domain aims to extract, from futures news and commentary, the messages that may affect the futures market. In this work, we construct a Futures Messages Extraction Dataset (FMED) by extracting message-related entities from the Corpus for Entity and Relationship in Futures domain (CERF), and we propose MGCC, a message extraction model based on paraphrase generation. We define prompt templates for the task and use the pre-trained model mT5. To improve the model's performance, a constraint and a classifier are used when generating the target sentences. On the validation set, the model achieves precision, recall, and F1 of 0.732, 0.733, and 0.732, respectively.