Abstract: We present an efficient approach to training dual-encoder models for large-scale retrieval tasks, aimed at reducing computational overhead and improving retrieval performance. Our method introduces a novel combination of large-scale similarity-based negative sampling and direct gradient updates to cached target embeddings. By using a pre-trained encoder to initialize target embeddings and storing them in a buffer, we eliminate the need for frequent recomputation, reducing both computational cost and memory usage. Negative samples are selected from the top-$k$ most similar target embeddings within the batch and across queries, and the cached embeddings are updated directly through gradient descent. We further use the Faiss library for nearest-neighbor search, periodically rebuilding the index to maintain efficiency. Our approach accelerates training and improves retrieval accuracy, especially for quantized index types, providing a scalable solution for large-scale retrieval that balances computational efficiency and retrieval precision.
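The sketch below illustrates the training loop outlined in the abstract: a trainable buffer of cached target embeddings, top-$k$ similarity-based negative mining with a Faiss index, gradient updates applied directly to the cache, and periodic index rebuilding. It is a minimal, hypothetical reconstruction, not the authors' implementation; the query encoder, embedding dimension, toy data, and names such as `target_emb`, `k_neg`, and `rebuild_every` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
import faiss

emb_dim, num_targets, k_neg, rebuild_every = 128, 10_000, 32, 100

# Cached target embeddings: in the paper these are initialized with a pre-trained
# encoder; here random placeholders stand in. Storing them as a trainable parameter
# lets gradient descent update the cache directly instead of re-encoding targets.
target_emb = torch.nn.Parameter(torch.randn(num_targets, emb_dim))

# Toy query encoder (placeholder for the pre-trained query tower of the dual encoder).
query_encoder = torch.nn.Embedding(5_000, emb_dim)

optimizer = torch.optim.Adam([target_emb, *query_encoder.parameters()], lr=1e-3)

def build_index(emb: torch.Tensor) -> faiss.Index:
    """Rebuild the Faiss index from the current cached target embeddings."""
    index = faiss.IndexFlatIP(emb_dim)        # exact inner-product search
    index.add(emb.detach().cpu().numpy())     # Faiss expects float32 numpy arrays
    return index

index = build_index(target_emb)

for step in range(1, 1_001):
    # Toy batch: query ids and the index of each query's positive target.
    q_ids = torch.randint(0, 5_000, (64,))
    pos_ids = torch.randint(0, num_targets, (64,))

    q = query_encoder(q_ids)                                    # (B, d)

    # Similarity-based negative mining: the top-k most similar cached targets for
    # each query in the batch. A full implementation would filter out the positive
    # if it appears among the mined candidates.
    _, nbr_ids = index.search(q.detach().cpu().numpy(), k_neg)  # (B, k)
    nbr_ids = torch.from_numpy(nbr_ids).long()

    # Candidate set per query = its positive plus the mined negatives.
    cand_ids = torch.cat([pos_ids.unsqueeze(1), nbr_ids], dim=1)  # (B, 1+k)
    cand_emb = target_emb[cand_ids]                               # (B, 1+k, d)

    # Softmax cross-entropy over candidates; slot 0 holds the positive.
    logits = torch.einsum("bd,bkd->bk", q, cand_emb)
    loss = F.cross_entropy(logits, torch.zeros(len(q_ids), dtype=torch.long))

    optimizer.zero_grad()
    loss.backward()    # gradients flow into both the query encoder and the cache
    optimizer.step()

    # Periodic rebuild so nearest-neighbor search reflects the updated cache.
    if step % rebuild_every == 0:
        index = build_index(target_emb)
```

A flat inner-product index is used here for simplicity; the abstract's note on quantized index types would correspond to swapping in a compressed Faiss index, with the same periodic rebuilding.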
Paper Type: Long
Research Area: Information Retrieval and Text Mining
Research Area Keywords: dense retrieval
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2164