HaloRAG: Towards Mitigating LLM Hallucinations with Low-Cost Real-Time Retrieval

ACL ARR 2024 June Submission2873 Authors

15 Jun 2024 (modified: 06 Aug 2024) · CC BY 4.0
Abstract: Large Language Models (LLMs) often struggle to stay up to date because they rely on static training datasets, leading to outdated responses and hallucinations. We introduce HaloRAG, a cost-efficient agentic wrapper that augments LLMs with real-time information retrieval built on web scraping. Using semantic search and Retrieval-Augmented Generation (RAG), the wrapper fetches, validates, and summarizes current web data, extending the LLM's knowledge base without retraining. This substantially improves the accuracy and relevance of LLM responses, particularly for queries that require the latest information. Comparative analysis shows that the wrapper-enhanced LLM outperforms models such as GPT-3.5 and Claude on queries about recent events and emerging technologies. This work advocates integrating real-time retrieval to reduce hallucinations and broaden the practical applicability of LLMs across domains.
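The fetch → validate/rank → summarize → augment pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: all names (`fetch_web_snippets`, `rank`, `build_prompt`) are assumptions, the web fetch is stubbed with canned data, and a toy keyword-overlap score stands in for the paper's semantic search.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """A retrieved web passage with its source URL."""
    url: str
    text: str


def fetch_web_snippets(query: str) -> list[Snippet]:
    # Placeholder for real-time web scraping; returns canned data here
    # so the sketch is self-contained.
    return [
        Snippet("https://example.com/a",
                "The 2024 release added real-time retrieval support."),
        Snippet("https://example.com/b",
                "Unrelated article about cooking pasta."),
    ]


def rank(query: str, snippets: list[Snippet], k: int = 1) -> list[Snippet]:
    # Toy lexical-overlap ranking; a real system would use embedding-based
    # semantic search as described in the abstract.
    q = set(query.lower().split())
    scored = sorted(snippets,
                    key=lambda s: -len(q & set(s.text.lower().split())))
    return scored[:k]


def build_prompt(query: str, snippets: list[Snippet],
                 max_chars: int = 200) -> str:
    # "Summarize" by truncation, then prepend the evidence so the LLM can
    # ground its answer in fresh text rather than static training data.
    context = "\n".join(f"[{s.url}] {s.text[:max_chars]}" for s in snippets)
    return (f"Context:\n{context}\n\n"
            f"Question: {query}\nAnswer using only the context.")


query = "What did the 2024 release add?"
prompt = build_prompt(query, rank(query, fetch_web_snippets(query)))
```

The key design point is that the wrapper changes only the prompt, never the model weights, which is why it extends the knowledge base without any retraining cost.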
Paper Type: Short
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: Large Language Models, Retrieval Augmented Generation, Low-resource LLMs
Contribution Types: Approaches to low-resource settings, Approaches to low-compute settings / efficiency
Languages Studied: English
Submission Number: 2873