Advancing Controllable Diffusion Model for Few-Shot Object Detection in Optical Remote Sensing Imagery

Tong Zhang, Yin Zhuang, Xinyi Zhang, Guanqun Wang, He Chen, Fukun Bi

Published: 01 Jan 2024, Last Modified: 05 Nov 2025, Venue: IGARSS 2024, License: CC BY-SA 4.0

Abstract: Few-shot object detection (FSOD) in optical remote sensing imagery must detect rare objects given only a few annotated bounding boxes. Such limited training data can hardly represent the data distribution of realistic remote sensing scenes, which restricts FSOD performance. Recently, learning conditional controls for text-to-image diffusion models has made great progress, enabling precise generation of controllable yet imaginative images from text prompts and spatially localized input conditions. Accordingly, in this work we explore the potential of diffusion models and propose a solution for few-shot object detection based on controllable data generation. First, drawing on a few annotated objects, their bounding boxes and categories are used as the spatial conditions and text prompts, respectively, and are fed into large text-to-image diffusion models for controlled image generation. Second, based on the generated images, a data transformation is devised to adapt to the scale and orientation variations of remote sensing objects and to improve the robustness of model training. Finally, experiments were conducted on the public remote sensing dataset DIOR, and the results demonstrate the effectiveness of the approach.
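
The two steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Annotation` type, the prompt template, the binary layout mask used as the spatial condition, and the specific scale and 90-degree rotation transforms are all assumptions made for illustration, since the abstract does not specify them. The diffusion model itself is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive


@dataclass
class Annotation:
    """Hypothetical container for one few-shot annotation."""
    category: str
    box: Box


def build_prompt(annotations: List[Annotation]) -> str:
    # Assumed prompt template; the abstract only says categories become text prompts.
    categories = sorted({a.category for a in annotations})
    return "an aerial image of " + ", ".join(categories)


def build_layout_mask(annotations: List[Annotation], width: int, height: int) -> List[List[int]]:
    # Assumed spatial condition: a binary mask with 1 inside every bounding box.
    mask = [[0] * width for _ in range(height)]
    for a in annotations:
        x0, y0, x1, y1 = a.box
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = 1
    return mask


def scale_box(box: Box, factor: float) -> Box:
    # Scale transform: the box follows the image when it is resized by `factor`.
    x0, y0, x1, y1 = box
    return (round(x0 * factor), round(y0 * factor), round(x1 * factor), round(y1 * factor))


def rotate_box_90_cw(box: Box, image_height: int) -> Box:
    # Orientation transform: rotate the image 90 degrees clockwise.
    # A point (x, y) maps to (image_height - y, x) in the rotated image.
    x0, y0, x1, y1 = box
    return (image_height - y1, x0, image_height - y0, x1)


if __name__ == "__main__":
    anns = [Annotation("airplane", (1, 2, 5, 4)), Annotation("ship", (0, 0, 2, 1))]
    print(build_prompt(anns))
    print(sum(map(sum, build_layout_mask(anns, width=10, height=6))))
    print(scale_box(anns[0].box, 2.0))
    print(rotate_box_90_cw(anns[0].box, image_height=6))
```

In such a pipeline, the prompt and the layout mask would be passed to a conditional text-to-image diffusion model, and the box transforms would be applied together with the matching image transforms so that the generated labels stay aligned with the pixels.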

External IDs: dblp:conf/igarss/ZhangZZWCB24