On the Theoretical Advantages of Bilinear Similarities in Dense Retrieval

Published: 2025, Last Modified: 20 Dec 2025. Venue: SISAP 2025. License: CC BY-SA 4.0
Abstract: We present a theoretical and empirical study of bilinear similarity functions in neural IR, showing that they are strictly more expressive than dot-product and weighted dot-product (WDP) models under fixed embeddings. We prove this separation formally and illustrate it with the Structured Agreement Ranking Task, where a simple rank-2 bilinear model achieves 100% accuracy while all WDP models fail. This highlights the importance of modeling feature interactions for conditional relevance. On MS MARCO, low-rank bilinear models significantly outperform dot-product baselines: a rank-32 model triples performance (MRR@10: 0.090 vs. 0.031), and rank-128 approaches a 4\(\times\) gain. These results offer a principled and practical case for using low-rank bilinear models in dense retrieval. Code: https://github.com/shubham526/bilinear-projection-theory.
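To make the three scoring functions named in the abstract concrete, the following is a minimal sketch, not the authors' code and not the Structured Agreement Ranking Task itself. It assumes the usual forms: dot product \(q^\top d\), WDP \(q^\top \mathrm{diag}(w)\, d\), and a low-rank bilinear score \(q^\top W d\) with \(W = P^\top Q\). The function names, the factors `P` and `Q`, and the toy vectors are all hypothetical and chosen only for illustration. The example shows one cross-feature interaction (query feature 1 with document feature 2) that a rank-2 bilinear score captures, while every WDP score is zero for the same pair regardless of the weights.

```python
# Toy illustration under the assumptions stated above; pure Python, no dependencies.

def dot(q, d):
    """Plain dot product q . d."""
    return sum(qi * di for qi, di in zip(q, d))

def weighted_dot(q, d, w):
    """Weighted dot product: sum_i w_i * q_i * d_i (a diagonal bilinear form)."""
    return sum(wi * qi * di for wi, qi, di in zip(w, q, d))

def low_rank_bilinear(q, d, P, Q):
    """Low-rank bilinear score (P q) . (Q d) = q^T (P^T Q) d, rank <= len(P)."""
    Pq = [dot(row, q) for row in P]
    Qd = [dot(row, d) for row in Q]
    return dot(Pq, Qd)

# Hypothetical rank-2 factors giving W = P^T Q = [[0, 1], [1, 0]],
# so the score is q1*d2 + q2*d1 (pure cross-feature interaction).
P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[0.0, 1.0], [1.0, 0.0]]

q = [1.0, 0.0]        # query active on feature 1 only
d_cross = [0.0, 1.0]  # document active on feature 2 only
d_zero = [0.0, 0.0]   # document with no active features

bilinear_scores = (low_rank_bilinear(q, d_cross, P, Q),
                   low_rank_bilinear(q, d_zero, P, Q))

# Any choice of WDP weights scores both documents identically here,
# because q and d_cross share no active coordinate.
wdp_scores = [(weighted_dot(q, d_cross, w), weighted_dot(q, d_zero, w))
              for w in ([1.0, 1.0], [5.0, -3.0], [0.1, 100.0])]

print(bilinear_scores)
print(wdp_scores)
```

In this toy setting the bilinear model ranks `d_cross` above `d_zero`, whereas WDP ties them for every weight vector, since a diagonal form only multiplies matching coordinates. This is the kind of feature interaction the abstract refers to; the formal separation result is in the paper.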