Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization

Published: 18 Sept 2025, Last Modified: 29 Oct 2025. Venue: NeurIPS 2025 poster. License: CC BY 4.0
Keywords: Image Retrieval, Instance Retrieval, Personalization, Feature Aggregation, Feature Optimization
TL;DR: We propose a novel method and new benchmarks for retrieving images that contain small objects
Abstract: We address the challenge of Small Object Image Retrieval (SoIR), where the goal is to retrieve images containing a specific small object in a cluttered scene. The key challenge in this setting is constructing a single image descriptor, for scalable and efficient search, that effectively represents all objects in the image. In this paper, we first analyze the limitations of existing methods on this task and then introduce new benchmarks to support SoIR evaluation. Next, we introduce Multi-Object Attention Optimization, a novel retrieval framework that incorporates a dedicated multi-object pre-training phase. This is followed by a refinement process that uses attention-based feature extraction with object masks and integrates the resulting features into a single unified image descriptor. Our approach significantly outperforms existing retrieval methods and strong baselines, achieving notable improvements in both the zero-shot setting and the lightweight multi-object fine-tuning setting. We hope this work will inspire further research to improve retrieval performance on this highly practical task.
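To make the "single unified image descriptor" idea concrete, here is a minimal sketch and not the authors' implementation. It assumes patch-level features and per-object masks are already available, and it substitutes plain mask-weighted average pooling for the attention-based extraction and refinement described in the abstract. The function names `masked_descriptor` and `rank_gallery` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def masked_descriptor(patch_feats, object_masks):
    """Pool patch features per object mask, then merge into one descriptor.

    patch_feats:  (P, D) array of patch features (assumed precomputed).
    object_masks: (K, P) array of binary or soft masks, one row per object.
    Returns a unit-norm (D,) descriptor for the whole image.
    """
    # Normalize each mask so it acts as an averaging weight over its patches.
    weights = object_masks / (object_masks.sum(axis=1, keepdims=True) + 1e-12)
    # One feature per object; unit-normalizing keeps small objects from being
    # drowned out by large ones when the features are averaged.
    object_feats = l2_normalize(weights @ patch_feats)  # (K, D)
    # Assumption: a simple mean stands in for the paper's learned aggregation.
    return l2_normalize(object_feats.mean(axis=0))  # (D,)


def rank_gallery(query_desc, gallery_descs):
    """Return gallery indices sorted by descending cosine similarity."""
    scores = gallery_descs @ query_desc  # descriptors are unit-norm
    return np.argsort(-scores)
```

Because every gallery image is reduced to one vector, search stays a single matrix-vector product regardless of how many objects each image contains, which is the scalability property the abstract emphasizes.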
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 21420