Research on semantic representation and citation recommendation of scientific papers with multiple semantics fusion
Abstract: As the number of scientific papers grows, citation recommendation, which helps researchers find useful references efficiently and thereby promotes academic communication and cooperation, has become increasingly important. However, little research has explored how to recognize semantically relevant references according to the research scenario and the citation context of a paper. To address this gap, the present study uses SciBERT to represent text and expands its semantics by fusing WordNet knowledge. In addition, core themes are automatically extracted from references with TextRank to address the problem of incomplete content extraction. On this basis, a model named SciBERT + DPCNN is constructed for the semantic representation and citation recommendation of scientific papers. Experiments are then designed and carried out in three parts to verify the effectiveness of the model. First, SciBERT + DPCNN achieves the highest scores when compared with all baseline models. Second, with 1 WordNet fusion at the end of the sentence, the model achieves its best results: 84.72%, 84.80%, 84.72%, and 84.71% in accuracy, precision, recall, and F1-score, respectively. Finally, for the classification of the reference structure, the long text ‘title + abstract + TextRank full text (except the title and abstract)’ outperforms the short text ‘title + abstract’ in most cases when WordNet is not fused. However, when WordNet is fused for the classification, the short text is mostly more accurate than the long text.
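To illustrate the TextRank step mentioned in the abstract, the following is a minimal sketch of extractive sentence ranking. The abstract does not state the configuration used in the study, so everything here is an assumption: the naive regex sentence splitter, the word-overlap similarity normalized by log sentence lengths, the damping factor of 0.85, and the function names (`split_sentences`, `tokenize`, `similarity`, `textrank_sentences`), which are hypothetical. It is not the authors' implementation.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter on '.', '!' or '?' (an assumption for this sketch)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase alphanumeric tokens; no stop-word removal or stemming."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(tokens_a, tokens_b):
    """Shared-word count normalized by the log lengths of both sentences."""
    if len(tokens_a) < 2 or len(tokens_b) < 2:
        return 0.0  # avoids a zero denominator, since log(1) == 0
    overlap = len(set(tokens_a) & set(tokens_b))
    return overlap / (math.log(len(tokens_a)) + math.log(len(tokens_b)))


def textrank_sentences(text, top_k=2, damping=0.85, iterations=50):
    """Return the top_k highest-ranked sentences, in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n == 0:
        return []

    # Symmetric weighted graph: one node per sentence.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = similarity(tokens[i], tokens[j])
            weights[i][j] = weights[j][i] = w
    out_sum = [sum(row) for row in weights]

    # Weighted PageRank by fixed-count power iteration.
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores

    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

In a pipeline like the one described, the selected sentences could be appended to the title and abstract to form the long-text input; how the study actually combines them is not specified in the abstract.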