Abstract: Dense retrievers use pretrained language models to encode queries and documents as high-dimensional embeddings for retrieval. However, these high-dimensional embeddings usually lead to more expensive index storage and higher retrieval latency. In this paper, we further explore the potential of building a lightweight dense retrieval system by incorporating dimension reduction into the encoding-indexing pipeline. Our experiments demonstrate that the encoding-compression-indexing-retrieval method yields an efficient dense retrieval system, reducing retrieval latency by 96% while maintaining comparable retrieval effectiveness. Our further analyses illustrate that the dimension reduction method retains its retrieval effectiveness across different domains and works with different index building methods.
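The encoding-compression-indexing-retrieval pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name the reduction method, so PCA is assumed here, random vectors stand in for encoder outputs, a brute-force inner-product search stands in for the index, and the function names `fit_pca`, `compress`, and `search` as well as the sizes 768 and 128 are hypothetical.

```python
import numpy as np

def fit_pca(embeddings, out_dim):
    """Fit a PCA projection (an assumed choice of dimension reduction)."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:out_dim].T

def compress(embeddings, mean, projection):
    """Project embeddings into the reduced space."""
    return (embeddings - mean) @ projection

def search(query_vecs, index, k):
    """Brute-force inner-product search; returns top-k document ids per query."""
    scores = query_vecs @ index.T
    return np.argsort(-scores, axis=1)[:, :k]

# Stand-ins for encoder outputs (hypothetical sizes).
rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 768)).astype(np.float32)
queries = rng.normal(size=(5, 768)).astype(np.float32)

# Encode -> compress -> index -> retrieve.
mean, proj = fit_pca(docs, out_dim=128)
index = compress(docs, mean, proj)
hits = search(compress(queries, mean, proj), index, k=10)
```

In this sketch the index stores 128 values per document instead of 768, which is where the storage and latency savings would come from; how much effectiveness is retained depends on the encoder and the reduction method and is not something this toy example measures.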