Student Lead Author Indication: No
Keywords: Retrieval-Augmented Generation, RAG, Large Language Model, LLM, Embedded devices, Chunking Strategies
TL;DR: HiRAG: a RAG pipeline designed for embedded applications, which delivers concise yet comprehensive contextual information to the LLM prompt.
Abstract: Retrieval-Augmented Generation effectively overcomes the knowledge limitations of Large Language Models. This approach improves answer quality by incorporating additional context into the input prompt. In this paper, we propose Human-inspired Retrieval-Augmented Generation (HiRAG), which reduces prompt sizes by creating and retrieving short yet comprehensive contexts. Our experiments demonstrate that HiRAG improves retrieval accuracy with smaller embedding models, especially for technical data. On average, HiRAG condenses more knowledge into shorter prompts while preserving context quality. This reduction lowers latency during inference, making it well-suited for embedded devices.
Submission Number: 16