Hybrid Retrieval Systems Based on LLMs Embedding and Enhancement

08 Jul 2024 (modified: 15 Aug 2024) | KDD 2024 Workshop OAGChallenge Cup Submission | CC BY 4.0
Keywords: Retrieval System, Data Enhancement, Candidate Generation, Ranking Candidates
TL;DR: The Onewo algorithm team developed a four-stage approach: data processing and enhancement with LLMs, candidate generation, candidate ranking, and weighted ensemble.
Abstract: With the increasing popularity of large language models (LLMs), retrieving accurate and relevant scientific research documents has become crucial to improving how well language models answer user questions. To address this challenge, Tsinghua University's Knowledge Engineering Group (KEG), in collaboration with ZhipuAI, launched the AQA-KDD-2024 competition. For this competition, the Onewo algorithm team developed an approach consisting of four stages: data processing and enhancement with LLMs, candidate generation, candidate ranking, and weighted ensemble. A key highlight of our approach is data enhancement, in which an LLM is used to improve the query and body texts by generating keywords and AI responses from an effective prompt template. On our test benchmark, the score improved from 0.16526 to 0.18367. With this solution, our team Onewo placed 8th on the final leaderboard of the AQA-KDD-2024 competition. The code is available at: https://github.com/Starrylun/AQA-KDD-2024-Rank8
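The weighted ensemble stage named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion formula, so everything here is an assumption made for illustration: the min-max normalization, the ranker names, the weights, and the `top_k` default are hypothetical and are not taken from the team's repository.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Rescale one ranker's scores to [0, 1]; constant scores all map to 1.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_ensemble(ranker_scores, weights, top_k=20):
    """Fuse per-ranker candidate scores into one ranked list of document ids.

    ranker_scores: {ranker_name: {doc_id: raw_score}} (hypothetical layout)
    weights:       {ranker_name: weight}              (hypothetical values)
    """
    fused = defaultdict(float)
    for name, scores in ranker_scores.items():
        if not scores:
            continue  # a ranker that returned no candidates contributes nothing
        for doc_id, s in min_max_normalize(scores).items():
            fused[doc_id] += weights[name] * s
    # Sort by fused score (descending); break ties by doc id for determinism.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked[:top_k]]


# Illustrative input: two rankers scoring the same three candidate papers.
example_scores = {
    "embedding_ranker": {"p1": 0.9, "p2": 0.1, "p3": 0.5},
    "reranker": {"p1": 0.0, "p2": 10.0, "p3": 5.0},
}
example_weights = {"embedding_ranker": 0.7, "reranker": 0.3}
example_result = weighted_ensemble(example_scores, example_weights, top_k=2)
```

Normalizing each ranker's scores before weighting keeps a ranker with a larger raw score scale from dominating the sum; rank-based fusion would be another reasonable choice under the same interface.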
Submission Number: 2