Abstract: Content Linking (CL) is a recently introduced task in natural language processing (NLP). In this paper, CL is treated as a binary classification task solved with neural networks: the input is the feature representation of a sentence pair, and the output is whether the two sentences are associated. Word embeddings are the most popular text feature representation. However, their dimensionality can be large, and simply concatenating the word embedding features of two words does not capture the relationship between them. We propose a new text feature, named Word2vec_V_I, built using Principal Component Analysis (PCA) and similarity calculation. Furthermore, a content linking method based on a CNN and Word2vec_V_I is designed and implemented. Experiments on the CL-SciSumm 2018 dataset verify the effectiveness of this method.
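The abstract does not spell out how Word2vec_V_I is constructed, so the following is only a minimal sketch of the general idea it names: reduce embedding dimensionality with PCA, then add a similarity term to the concatenated pair so the feature carries some information about the relationship. The function names (`pca_reduce`, `cosine`, `pair_feature`), the choice of cosine similarity, the target dimension, and the random stand-in embeddings are all assumptions made for illustration, not the authors' actual method.

```python
import numpy as np

def pca_reduce(vectors, k):
    """Project row vectors onto their top-k principal components."""
    centered = vectors - vectors.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def cosine(u, v):
    """Cosine similarity, defined as 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

def pair_feature(vec_a, vec_b):
    """Concatenate two reduced vectors and append their cosine similarity."""
    return np.concatenate([vec_a, vec_b, [cosine(vec_a, vec_b)]])

# Hypothetical data: random stand-ins for six 50-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 50))

reduced = pca_reduce(embeddings, k=4)           # shape (6, 4)
feature = pair_feature(reduced[0], reduced[1])  # 4 + 4 + 1 values
print(feature.shape)  # (9,)
```

In a pipeline like the one described, a feature of this kind would be computed for each candidate pair and passed to the classifier; how the paper arranges it as CNN input is not stated in the abstract.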