Keywords: tiny object detection, feature aggregation
TL;DR: We propose a Foreground Probing method that uses the relationships between reliable classification features to collectively enhance and refine the unreliable foreground scores.
Abstract: Detecting small objects in high-resolution images is challenging, as small targets are often overwhelmed by the surrounding background and are thus prone to being missed or misclassified. To address this issue, this work proposes a \emph{foreground probing} paradigm that recovers and refines suppressed foreground scores by leveraging collective classification features. At its core, we introduce a \emph{sparse token selection module} (STSM) that identifies potential foreground tokens across feature maps, so that promising candidates are not submerged in background features. To further enhance these representations, we design a \emph{foreground refinement module} (FRM) that distills a semantically enriched attention map from classification features to guide information aggregation. This allows tokens to adaptively reference semantically similar neighbors, thereby strengthening the discrimination between foreground and background in complex scenes. Extensive experiments demonstrate that our method achieves superior performance on small object detection. Our code will be released.
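Illustrative sketch (not the authors' implementation): the abstract does not give the internals of STSM or FRM, so the snippet below only illustrates the general idea under loudly stated assumptions. It assumes that token selection is a plain top-k over foreground scores, that the attention map is a softmax over cosine similarities of classification features, and that refinement is an attention-weighted average of the selected tokens' scores. The names `select_top_k`, `refine_scores`, and `temperature` are hypothetical.

```python
import math

def select_top_k(scores, k):
    """Assumed stand-in for sparse token selection: keep the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def refine_scores(scores, feats, k, temperature=0.1):
    """Assumed stand-in for refinement: each kept token re-estimates its
    foreground score as a similarity-weighted average over the kept tokens."""
    keep = select_top_k(scores, k)
    refined = list(scores)  # tokens that are not kept retain their score
    for i in keep:
        logits = [cosine(feats[i], feats[j]) / temperature for j in keep]
        peak = max(logits)  # subtract the max for a numerically stable softmax
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        refined[i] = sum(w * scores[j] for w, j in zip(weights, keep)) / total
    return keep, refined

# Token 1 has a weak score but features close to the confident tokens 0 and 3.
scores = [0.9, 0.2, 0.05, 0.8]
feats = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.9, 0.1]]
keep, refined = refine_scores(scores, feats, k=3)
print(keep, refined)
```

In this toy input, the weakly scored token 1 is pulled upward by its semantically similar, confident neighbors, while the unselected token 2 is left unchanged. A learned attention map would replace the fixed cosine similarity in any real model.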
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 2368