Abstract: Ship detection in remote sensing images has broad application value in both military and civilian domains. Traditional object detection algorithms rely solely on optical images as input and are susceptible to interference such as cloud and fog occlusion and scene noise, which degrades detection accuracy. Based on the YOLO framework, this paper designs a ship detection model that fuses visible-light and infrared images and explores several feature fusion strategies, implementing a data-level fusion network, SwinFuse-Ynet; a feature-level fusion network, E-Dualswin-Ynet; and a decision-level fusion network, L-Dualswin-Ynet. Comprehensive validation is conducted on the multispectral satellite image datasets MMShip and VI-ship. The experimental results demonstrate that the three proposed networks effectively exploit complementary infrared feature information, reduce interference from complex scene factors, and improve detection accuracy. In particular, E-Dualswin-Ynet, enhanced with the Swin Transformer and the Wise-IoU loss, most fully exploits the complementarity between the infrared and visible-light modalities and clearly outperforms single-modal detection methods.
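To illustrate the feature-level fusion idea mentioned in the abstract (this is a minimal NumPy sketch, not the authors' E-Dualswin-Ynet code; the function name and shapes are hypothetical), the snippet below concatenates a visible-light feature map and an infrared feature map along the channel dimension, the common way a dual-branch backbone merges modalities before a shared detection head:

```python
import numpy as np

def fuse_features(vis_feat: np.ndarray, ir_feat: np.ndarray) -> np.ndarray:
    """Feature-level fusion sketch: stack per-modality feature maps
    along the channel axis (C, H, W layout). Illustrative only."""
    # The two modalities must share the same spatial resolution.
    assert vis_feat.shape[1:] == ir_feat.shape[1:], "spatial sizes must match"
    return np.concatenate([vis_feat, ir_feat], axis=0)

# A 256-channel visible map and a 256-channel infrared map of the
# same 20x20 spatial size fuse into one 512-channel map.
vis = np.random.rand(256, 20, 20)
ir = np.random.rand(256, 20, 20)
fused = fuse_features(vis, ir)
print(fused.shape)  # (512, 20, 20)
```

Data-level fusion would instead merge the raw images before the backbone, and decision-level fusion would combine the per-modality detection outputs; the channel-concatenation shown here is only one of the strategies the paper compares.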