Keywords: Retrieval Augmented Generation, Large Language Model, Multi-Agents
Abstract: The increasing popularity of Retrieval Augmented Generation (RAG) with Large Language Models (LLMs) has highlighted the need to enhance responses to user queries by leveraging web content knowledge. Despite its potential, integrating noisy external web information often results in hallucination, and the problem of consistently providing correct answers remains unresolved. To advance research in this area, Meta introduced the comprehensive CRAG dataset and hosted the KDD Cup 2024 Challenge to drive RAG system development.
This paper details our solution in the competition, which consists of a three-component pipeline: Pre-processing, Retrieval, and Multi-Agent Generation. Our strategy incorporates Query Rewriting, Reference Constraint, and Conditional False-premise Detection to improve accuracy and reduce hallucinations. Moreover, we propose a novel "Thought-Chain-Agent-Flow" technique in the Multi-Agent Generation stage, enhancing the LLM's focus on critical facts and its reasoning capabilities. This approach demonstrated superior performance, leading our team, bumblebee7, to win first place in the multi-hop challenge of Task 1 and maintain top positions in Task 2 and Task 3, competing against over 2,000 participants.
Submission Number: 5