LargeRSDet: A Large Mini-Batch Object Detector for Remote Sensing Images

Huming Zhu, Qiuming Li, Kongmiao Miao, Jincheng Wang, Biao Hou, Licheng Jiao

Published: 2024, Last Modified: 13 Nov 2024. IEEE Geosci. Remote. Sens. Lett. 2024. CC BY-SA 4.0

Abstract: Deep neural network models based on the vision transformer (ViT) have shown unprecedented performance in remote sensing image object detection. However, these models often require massive amounts of training data, which makes training time-consuming and slows research progress. Distributed training is a common way to shorten the training period. In this letter, we propose a large-batch object detector named LargeRSDet for the remote sensing image object detection task, which can train with a batch size of up to 1024 with only a small, acceptable loss in performance. Using LargeRSDet, we can effectively utilize up to 1024 GPUs and greatly improve training speed, which not only helps the model converge faster but also makes it possible to reach higher accuracy. Experimental results demonstrate that our method can finish training on the DIOR remote sensing image dataset in less than 5 min, with the final model achieving 75% mAP@0.5.
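
The abstract does not say how LargeRSDet keeps accuracy at a batch size of 1024, so the following is only a generic illustration of the arithmetic that large mini-batch training usually involves, not the authors' method. It sketches how a global batch is split across GPUs and applies the widely used linear learning-rate scaling heuristic with a linear warmup. The function names, the base learning rate, the base batch size, and the warmup length are all hypothetical values chosen for illustration.

```python
def per_gpu_batch(global_batch: int, num_gpus: int) -> int:
    """Split a global mini-batch evenly across data-parallel workers."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if global_batch % num_gpus != 0:
        raise ValueError("global batch must be divisible by the number of GPUs")
    return global_batch // num_gpus


def scaled_lr(base_lr: float, base_batch: int, global_batch: int) -> float:
    """Linear scaling heuristic: grow the learning rate with the batch size.

    This is a common rule of thumb for large-batch SGD, assumed here for
    illustration; the abstract does not state which rule LargeRSDet uses.
    """
    return base_lr * global_batch / base_batch


def lr_at_step(step: int, warmup_steps: int, target_lr: float) -> float:
    """Ramp the learning rate linearly up to target_lr, then hold it."""
    if step < warmup_steps:
        return target_lr * (step + 1) / warmup_steps
    return target_lr


if __name__ == "__main__":
    # Hypothetical settings: batch 1024 on 1024 GPUs means one image per GPU.
    print(per_gpu_batch(1024, 1024))
    # Hypothetical reference point: lr 0.01 tuned at batch size 16.
    target = scaled_lr(0.01, 16, 1024)
    print(target)
    # Learning rate at the start and at the end of a 500-step warmup.
    print(lr_at_step(0, 500, target), lr_at_step(500, 500, target))
```

With a global batch of 1024 spread over 1024 GPUs, each device processes a single image per step, which is why techniques that stabilise optimisation at large batch sizes matter more than per-device throughput in this regime.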