TriSampler: A Better Negative Sampling Principle for Dense Retrieval


17 Apr 2023, ACL ARR 2023 April Blind Submission, Readers: Everyone
Abstract: Negative sampling is an essential technique for training dense retrieval models, and it significantly affects retrieval performance. While existing negative sampling methods have achieved promising results by leveraging hard negatives, a general principle to guide negative sampling, covering both negative candidate construction and negative sampling distribution design, is still lacking. To address this gap, we conduct a theoretical analysis of negative sampling in dense retrieval and propose the quasi-triangular principle, which describes the triangular-like relationship among the query, positive document, and negative document. Building on this principle, we develop a simple yet effective negative sampling method, TriSampler, which aims to sample more informative negatives within a constrained region. Experimental results indicate that TriSampler achieves superior retrieval performance across various representative retrieval models.
Paper Type: long
Research Area: Information Retrieval and Text Mining
