Image retrieval with self-supervised divergence minimization and cross-attention classification

Published: 07 Jul 2025 · Last Modified: 07 Mar 2025 · OpenReview Archive Direct Upload · CC BY 4.0
Abstract: Common approaches to image retrieval rely on contrastive methods and specialized loss functions such as ranking losses and entropy regularizers. We present DMCAC (Divergence Minimization with Cross-Attention Classification), which offers a new perspective on this training paradigm. Using self-supervision, we combine a novel divergence loss framework with a simple data-flow adjustment that directly minimizes distributional divergence over a database during training. We show that jointly learning the query representation over a database is a competitive, and often superior, alternative to contrastive and other methods for image retrieval. We evaluate DMCAC across several model configurations and four datasets, achieving state-of-the-art performance in multiple settings. A thorough set of ablations demonstrates the robustness of our method across full vs. approximate retrieval and different hyperparameter configurations.
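The abstract does not spell out the loss, but the core idea of treating retrieval as divergence minimization over a database can be sketched roughly as follows. This is a minimal illustration, not the paper's actual method: the function name, the KL-to-one-hot target (which reduces to cross-entropy on the matching entry), and the temperature are all assumptions.

```python
import numpy as np

def database_divergence_loss(query_emb, db_emb, target_idx, temperature=0.07):
    """Hypothetical sketch: retrieval as classification over the database.

    A softmax over query-database similarities gives a predicted distribution;
    we minimize its divergence from a one-hot target distribution, which for
    one-hot targets reduces to cross-entropy on the matching database entry.
    """
    logits = query_emb @ db_emb.T / temperature           # (B, N) similarities
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # KL(one-hot || predicted) = -log p(correct entry), averaged over the batch
    return -log_probs[np.arange(len(target_idx)), target_idx].mean()

# Toy usage with random unit-norm embeddings (shapes are illustrative).
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 16)); q /= np.linalg.norm(q, axis=1, keepdims=True)
db = rng.normal(size=(10, 16)); db /= np.linalg.norm(db, axis=1, keepdims=True)
loss = database_divergence_loss(q, db, np.array([0, 3, 5, 9]))
```

Under this reading, lowering the loss pushes each query embedding toward its matching database entry relative to every other entry at once, which is what "jointly learning the query representation over a database" suggests.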