Observations on Building RAG Systems for Technical Documents

Published: 19 Mar 2024, Last Modified: 31 Mar 2024
Venue: Tiny Papers @ ICLR 2024 Archive
License: CC BY 4.0
Keywords: Retrieval Augmented Generation, Large Language Models, Natural Language Processing, Embedding Models, Chatbots, Technical Documents
TL;DR: Evaluation of parameters affecting Retrieval Augmented Generation for Technical Documents.
Abstract: Retrieval augmented generation (RAG) for technical documents is challenging because embeddings often fail to capture domain-specific information. We review prior art on the important factors affecting RAG and perform experiments to highlight best practices and potential challenges in building RAG systems for technical documents.
Supplementary Material: zip
Submission Number: 108