Multi-source information fusion for tracing the sources of papers

Yuxuan Wu; Genhang Shen; Hongyu Fan

12 Jul 2024 (modified: 12 Aug 2024) | KDD 2024 Workshop OAGChallenge Cup Submission | CC BY 4.0

Keywords: Natural Language Processing, Text Similarity, Paper Ranking

Abstract: As academic data mining advances, the origins of a paper can be traced more thoroughly and precisely, which helps identify the pivotal papers that shape entire research fields. The paper source tracing (PST) task in KDD Cup 2024 was introduced to address this challenge. This paper proposes a solution that uses a dual-network architecture to integrate information from both a paper and its references. To compensate for insufficient information, we draw on data from DBLP Citations and on contributions extracted from the introduction and conclusion sections. We set two optimization objectives: maximizing the similarity between related papers and minimizing the similarity between unrelated papers. The model is trained with backpropagation. Our team, pigpigwin, placed 10th in KDD Cup 2024 with a final test set score of 0.38159, demonstrating the effectiveness of the proposed solution.
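
The two optimization objectives named in the abstract can be illustrated with a small sketch. This is not the authors' loss function, which the abstract does not specify. It assumes cosine similarity as the similarity measure and a simple hinge-style penalty, and the names `cosine_similarity`, `pairwise_objective`, and `margin` are hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_objective(paper_vec, source_vec, non_source_vec, margin=0.0):
    """Illustrative loss combining both objectives (an assumption, not the paper's formula).

    pull: shrinks as the paper becomes more similar to a related (source) reference.
    push: penalizes similarity to an unrelated reference above `margin`.
    """
    pull = 1.0 - cosine_similarity(paper_vec, source_vec)
    push = max(0.0, cosine_similarity(paper_vec, non_source_vec) - margin)
    return pull + push


# Toy embeddings: the paper matches source_ref and is orthogonal to other_ref.
paper = [1.0, 0.0]
source_ref = [1.0, 0.0]
other_ref = [0.0, 1.0]

loss_good = pairwise_objective(paper, source_ref, other_ref)  # correct pairing
loss_bad = pairwise_objective(paper, other_ref, source_ref)   # swapped pairing
print(loss_good, loss_bad)
```

In this toy case the correct pairing gives a lower loss than the swapped one, which is the behavior both objectives are meant to encourage. In a trained model the vectors would come from the two networks, and the loss would be minimized by backpropagation.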

Submission Number: 7
