An Automatic Topic-oriented Structured Text Extraction Method based on CRF and Deep Learning

12 May 2023 (modified: 15 May 2023), OpenReview Archive Direct Upload, Readers: Everyone
Abstract: Automatic extraction of text information plays an important role in machine translation, knowledge mapping, and other fields. In recent years, with the rapid development of computer technology and the spread of Internet applications, the resources that people acquire through the Internet have grown explosively. Given this mass of information, how to extract the required information quickly and effectively and convert it into structured data has become an active research topic. This paper therefore proposes an automatic topic-oriented text extraction method that combines BiLSTM and CRF models. The method first establishes the text extraction topic, then performs automatic entity recognition on the topic-related data, and finally produces standard structured data, thereby converting unstructured text into information blocks with a specific structure and laying the foundation for knowledge mining. With the data collected in ACL 2018 Chinese NER as test data, our algorithm achieves a precision of 95.74%. Compared with the traditional neural network method, our text information extraction method identifies more entity information in the text data and performs better in practical applications.
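
The last step described in the abstract, turning recognized entities into structured data, might look roughly like the sketch below. It is an illustration only, not the authors' implementation: the BIO tag scheme, the character-level tokens, the function names `bio_to_spans` and `to_record`, and the record layout are all assumptions, and the tags are taken as given rather than produced by a BiLSTM-CRF model.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans.

    Assumes character-level tokens (as is common for Chinese NER),
    so the tokens of a span are joined without a separator.
    """
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any span that is still open.
            if current_tokens:
                spans.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span (if any) and drop this token.
            if current_tokens:
                spans.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, "".join(current_tokens)))
    return spans


def to_record(topic, tokens, tags):
    """Build one structured record (hypothetical layout) for a topic."""
    entities = {}
    for entity_type, text in bio_to_spans(tokens, tags):
        entities.setdefault(entity_type, []).append(text)
    return {"topic": topic, "entities": entities}


# Hypothetical input: tags as a sequence labeller might emit them.
tokens = list("张三在北京工作")
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]
record = to_record("employment", tokens, tags)
```

Here `record` groups the recognized person and location under the chosen topic, which is one simple way to represent an "information block" with a fixed structure.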