A Framework for Effective Knowledge Extraction from A Data Space Formed by Unstructured Technical Reports using Pre-trained Models

Published: 01 Jan 2021, Last Modified: 12 May 2023, Venue: ICEBE 2021, Readers: Everyone
Abstract: The transformation of unstructured data into triples is a key task in knowledge graph construction. Completing this task on technical reports remains a great challenge. In this work, we propose a framework for effectively structuring data in knowledge graph construction from a data space formed by technical reports. Specifically, the framework consists of two pre-trained language models that provide the embeddings and a sequence labeling model that tags the entity labels. The output vectors of the two pre-trained models, i.e., the Flair embedding and the BERT model, are combined and passed to the downstream task. To evaluate the proposed method, we conduct named entity recognition experiments using the status reports of complex equipment in nuclear power plants. The evaluation shows that the framework achieves a remarkable improvement in F1 score. This paper details the framework, the experiments, and the evaluation of the proposed method.
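As a rough illustration of the pipeline described in the abstract, the following minimal Python sketch shows one way two per-token embedding vectors could be combined and how an entity-level F1 score could be computed from tag sequences. It is not the authors' implementation. The abstract does not say how the vectors are combined or which tagging scheme is used, so concatenation and BIO tags are assumptions here, and the function names (`stack_embeddings`, `bio_spans`, `entity_f1`) and the labels `EQ` and `ST` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Vector = List[float]
Span = Tuple[int, int, str]  # (start index, end index exclusive, entity label)


def stack_embeddings(flair_vecs: List[Vector], bert_vecs: List[Vector]) -> List[Vector]:
    """Concatenate the two vectors of each token (assumed combination strategy)."""
    if len(flair_vecs) != len(bert_vecs):
        raise ValueError("both models must embed the same tokens")
    return [f + b for f, b in zip(flair_vecs, bert_vecs)]


def bio_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a predicted span counts only if boundaries and label match."""
    gold_spans, pred_spans = bio_spans(gold), bio_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_spans)
    recall = true_pos / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: four tokens, two gold entities, one of them predicted.
gold_tags = ["B-EQ", "I-EQ", "O", "B-ST"]
pred_tags = ["B-EQ", "I-EQ", "O", "O"]
score = entity_f1(gold_tags, pred_tags)
```

In a real system, the per-token vectors would come from the pre-trained models and the predicted tags from the trained sequence labeling model. The toy values above only exercise the bookkeeping.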