Keywords: Grafting Learning, Large Language Models, Knowledge Graph, BERT, Paper Source Tracing, KDD Cup 2024
Abstract: This paper presents the solution of team BlackPearl in the KDD Cup 2024 OAG Challenge - PST (paper source tracing).
The goal of this competition is to identify "ref-sources" in the full text of a given paper. A ref-source is the most important reference (called the "source paper"), generally the work that provided the greatest inspiration for the paper.
Our solution proposes an LLM (Large Language Model) system based on grafting learning, which fully leverages all of the noisy and noise-free data and transfers the output confidence of BERT models to the LLM. Additionally, we have developed an automatic feature engineering pipeline based on RAG (Retrieval-Augmented Generation), which effectively supplements the paper's knowledge graph information. Our method ranked 1st on the final leaderboard of Task PST. Our solution and code are publicly available at https://github.com/BlackPearl-Lab/KddCup-2024-OAG-Challenge-1st-Solutions/tree/main.
Submission Number: 10