MERLIN: Multiple Enhanced Representations with LLM Generated INdices

Published: 31 May 2024, Last Modified: 20 Jun 2024 · Gen-IR@SIGIR24 · CC BY 4.0
Keywords: Information Retrieval, Generative AI, Generative Models, Ranking
TL;DR: Leverage LLM generated indices to improve retrieval performance
Abstract: Large Language Models (LLMs) can be leveraged to improve performance in various stages of the search pipeline: index enhancement, query rewriting, and ranking or re-ranking. The latter two methods require LLM calls during inference, adding latency before the final ranked list of documents is returned. Index enhancement, on the other hand, can be done in the indexing phase in near real time, and can improve retrieval performance while adding little or no latency at query time. Enhancing indexes with LLM-generated information is a promising mechanism to improve first-stage retrieval in dense retrieval with bi-encoders, on par with or exceeding the other two approaches. In this work, we show that by using multiple indexes to represent documents in different ways, where the representations are generated by an LLM, and querying these indexes in parallel, we can improve retrieval performance with almost no increase in runtime latency. Our results are consistent across a number of pre-trained bi-encoder models. We detail the implementation of such a system in an industrial setting with AWS services in the customer service domain, helping retrieve the correct self-help content for an Amazon customer query.
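The core idea of the abstract — maintaining several LLM-generated representations of the same documents in separate indexes, querying them in parallel, and fusing the results — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the index names (`raw`, `summary`, `questions`), the toy character-bag embedding standing in for a bi-encoder, and the max-score fusion rule are all assumptions made for the example.

```python
# Hypothetical sketch of MERLIN-style parallel multi-index retrieval.
# The embed() function is a toy stand-in for a real bi-encoder model.
from concurrent.futures import ThreadPoolExecutor

def embed(text):
    # Toy bag-of-letters embedding, L2-normalized; a real system would
    # use a pre-trained bi-encoder here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(index, query_vec, k=2):
    # Score every document representation stored in one index.
    scored = [(doc_id, dot(query_vec, vec)) for doc_id, vec in index.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

def merlin_retrieve(indexes, query, k=2):
    """Query all indexes in parallel; fuse by max score per document."""
    qv = embed(query)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda idx: search(idx, qv, k), indexes.values()))
    fused = {}
    for hits in results:
        for doc_id, score in hits:
            fused[doc_id] = max(score, fused.get(doc_id, float("-inf")))
    return sorted(fused.items(), key=lambda t: t[1], reverse=True)[:k]

# Each index holds a different (assumed) LLM-generated view of the same
# documents: raw text, an LLM summary, and LLM-generated questions.
indexes = {
    "raw":       {"doc1": embed("reset your password"),
                  "doc2": embed("track your package")},
    "summary":   {"doc1": embed("account password recovery help"),
                  "doc2": embed("shipping and delivery status")},
    "questions": {"doc1": embed("how do I reset my password?"),
                  "doc2": embed("where is my order?")},
}
top = merlin_retrieve(indexes, "how can I reset my password", k=1)
```

Because the per-index searches run concurrently, end-to-end query latency stays close to that of a single-index lookup, which is the property the abstract emphasizes.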
Submission Number: 12